Overview

Dataset statistics

Number of variables: 24
Number of observations: 500710
Missing cells: 5167226
Missing cells (%): 43.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 482.8 MiB
Average record size in memory: 1011.0 B
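
As a quick sanity check on the figures above, the missing-cell percentage can be recomputed from the variable and observation counts. This is only illustrative arithmetic on the reported numbers; the variable names below are not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 24
n_observations = 500710
missing_cells = 5167226

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = round(100 * missing_cells / total_cells, 1)

print(total_cells)   # total number of cells in the table
print(missing_pct)   # share of missing cells, in percent
```

The result agrees with the reported "Missing cells (%)" once rounded to one decimal place.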

Variable types

Numeric: 8
Categorical: 12
URL: 3
Unsupported: 1

Alerts

timestamp has a high cardinality: 492898 distinct values [High cardinality]
text_lang_ft has a high cardinality: 2648 distinct values [High cardinality]
text_normalized has a high cardinality: 355734 distinct values [High cardinality]
links has a high cardinality: 95588 distinct values [High cardinality]
hashtag has a high cardinality: 93371 distinct values [High cardinality]
hashtag_lang has a high cardinality: 2344 distinct values [High cardinality]
hashtag_en has a high cardinality: 92987 distinct values [High cardinality]
cashtag has a high cardinality: 415 distinct values [High cardinality]
media has a high cardinality: 104798 distinct values [High cardinality]
mentioned_users has a high cardinality: 62698 distinct values [High cardinality]
tweet_source has a high cardinality: 3579 distinct values [High cardinality]
credibility is highly imbalanced (76.9%) [Imbalance]
tweet_source is highly imbalanced (62.4%) [Imbalance]
links has 349025 (69.7%) missing values [Missing]
hashtag has 311862 (62.3%) missing values [Missing]
hashtag_lang has 311885 (62.3%) missing values [Missing]
hashtag_en has 311885 (62.3%) missing values [Missing]
cashtag has 499478 (99.8%) missing values [Missing]
media has 390417 (78.0%) missing values [Missing]
image_url has 396933 (79.3%) missing values [Missing]
video_url has 495073 (98.9%) missing values [Missing]
GIF_url has 499852 (99.8%) missing values [Missing]
reply_to_user has 433060 (86.5%) missing values [Missing]
mentioned_users has 347742 (69.4%) missing values [Missing]
quoted_tweet has 470842 (94.0%) missing values [Missing]
credibility has 349025 (69.7%) missing values [Missing]
likes is highly skewed (γ1 = 143.3287984) [Skewed]
retweets is highly skewed (γ1 = 167.9977401) [Skewed]
replies is highly skewed (γ1 = 403.2788348) [Skewed]
quoted_by_count is highly skewed (γ1 = 269.4598466) [Skewed]
timestamp is uniformly distributed [Uniform]
media is uniformly distributed [Uniform]
tweet_id has unique values [Unique]
reply_to_user is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
sentiment_polarity has 210275 (42.0%) zeros [Zeros]
likes has 314331 (62.8%) zeros [Zeros]
retweets has 359328 (71.8%) zeros [Zeros]
replies has 428227 (85.5%) zeros [Zeros]
quoted_by_count has 460723 (92.0%) zeros [Zeros]
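
The skewness alerts above report γ1 for count columns that are mostly zeros with a few very large values. The following is a minimal sketch of a plain moment-based γ1 on made-up data; the data and the function name are hypothetical, and the report's own values may include a bias correction, so they need not match this formula exactly. The log1p transform shown is one common way to compress such long tails. It is an assumption here, not something the report prescribes.

```python
import math

def skewness(values):
    """Plain moment-based skewness: m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical engagement counts: mostly zeros with one very large value.
raw_counts = [0, 0, 0, 1, 3, 1000]

# log1p maps 0 to 0 and compresses large values.
log_counts = [math.log1p(v) for v in raw_counts]

print(skewness(raw_counts))
print(skewness(log_counts))
```

On this toy input the transformed values are less skewed than the raw ones, although a sample this small cannot show the scale of the effect on the real columns.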

Reproduction

Analysis started: 2023-04-06 20:42:03.028315
Analysis finished: 2023-04-06 20:43:32.792085
Duration: 1 minute and 29.76 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json

Variables

user_id
Real number (ℝ)

Distinct: 97988
Distinct (%): 19.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.7750193 × 10^17
Minimum: 521
Maximum: 1.46 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.8 MiB

Quantile statistics

Minimum: 521
5-th percentile: 20751449
Q1: 1.919453 × 10^8
Median: 1.4098525 × 10^9
Q3: 7.68 × 10^17
95-th percentile: 1.2 × 10^18
Maximum: 1.46 × 10^18
Range: 1.46 × 10^18
Interquartile range (IQR): 7.68 × 10^17

Descriptive statistics

Standard deviation: 4.5318511 × 10^17
Coefficient of variation (CV): 1.6330881
Kurtosis: -0.47032404
Mean: 2.7750193 × 10^17
Median Absolute Deviation (MAD): 1.3651042 × 10^9
Skewness: 1.1320796
Sum: 7.11702 × 10^18
Variance: 2.0537674 × 10^35
Monotonicity: Not monotonic
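
Two of the descriptive statistics above follow directly from the mean and the standard deviation. The sketch below recomputes them from the rounded values printed in the report, so agreement is only expected to a few significant digits; the variable names are illustrative.

```python
# Values copied from the user_id statistics above.
mean = 2.7750193e17
std = 4.5318511e17

cv = std / mean        # coefficient of variation = standard deviation / mean
variance = std ** 2    # variance = standard deviation squared

print(round(cv, 4))
print(f"{variance:.4e}")
```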
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
7.76 × 10^17 | 3564 | 0.7%
1945935708 | 2908 | 0.6%
1.12 × 10^18 | 2738 | 0.5%
9.95 × 10^17 | 2715 | 0.5%
1.26 × 10^18 | 2503 | 0.5%
1.22 × 10^18 | 2348 | 0.5%
8.91 × 10^17 | 2312 | 0.5%
1.09 × 10^18 | 2239 | 0.4%
1.1 × 10^18 | 2230 | 0.4%
8.84 × 10^17 | 2187 | 0.4%
Other values (97978) | 474966 | 94.9%

Minimum 10 values
Value | Count | Frequency (%)
521 | 1 | < 0.1%
1378 | 1 | < 0.1%
2397 | 5 | < 0.1%
2806 | 1 | < 0.1%
3249 | 2 | < 0.1%
3334 | 1 | < 0.1%
3336 | 1 | < 0.1%
5618 | 2 | < 0.1%
5658 | 1 | < 0.1%
6664 | 2 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
1.46 × 10^18 | 56 | < 0.1%
1.45 × 10^18 | 117 | < 0.1%
1.44 × 10^18 | 158 | < 0.1%
1.43 × 10^18 | 277 | 0.1%
1.42 × 10^18 | 250 | < 0.1%
1.41 × 10^18 | 299 | 0.1%
1.4 × 10^18 | 438 | 0.1%
1.39 × 10^18 | 440 | 0.1%
1.38 × 10^18 | 508 | 0.1%
1.37 × 10^18 | 610 | 0.1%

timestamp
Categorical

HIGH CARDINALITY  UNIFORM 

Distinct: 492898
Distinct (%): 98.4%
Missing: 0
Missing (%): 0.0%
Memory size: 39.2 MiB

Value | Count
2016-09-05 19:25:39+00:00 | 33
2017-01-14 11:36:59+00:00 | 32
2017-05-11 11:56:12+00:00 | 15
2017-05-11 10:33:34+00:00 | 15
2017-05-13 16:29:34+00:00 | 13
Other values (492893) | 500602

Length

Max length: 25
Median length: 25
Mean length: 25
Min length: 25

Characters and Unicode

Total characters: 12517750
Distinct characters: 14
Distinct categories: 5
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 486347
Unique (%): 97.1%

Sample

1st row: 2013-09-03 02:22:09+00:00
2nd row: 2013-09-03 02:22:11+00:00
3rd row: 2013-09-03 10:11:50+00:00
4th row: 2013-09-03 11:33:26+00:00
5th row: 2013-09-03 20:10:51+00:00
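
The column is profiled as Categorical because its values are fixed-width strings (25 characters each) rather than datetimes. A minimal sketch of parsing the sample rows with the standard library is shown below; in a pandas workflow one would more likely call `pandas.to_datetime` on the whole column, which is an assumption about the pipeline, not something the report shows.

```python
from datetime import datetime

# Sample values copied from the report.
samples = [
    "2013-09-03 02:22:09+00:00",
    "2013-09-03 02:22:11+00:00",
    "2013-09-03 10:11:50+00:00",
]

# fromisoformat understands the "YYYY-MM-DD HH:MM:SS+00:00" layout used here.
parsed = [datetime.fromisoformat(s) for s in samples]

print(len(samples[0]))         # string length of one timestamp
print(parsed[0].utcoffset())   # UTC offset carried by the parsed value
print(parsed[1] - parsed[0])   # time between the first two sample rows
```

Once parsed, the values can be sorted, differenced, and binned as times. As strings they are effectively unique labels, which is why the report raises the high-cardinality alert for this column.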

Common Values

Value | Count | Frequency (%)
2016-09-05 19:25:39+00:00 | 33 | < 0.1%
2017-01-14 11:36:59+00:00 | 32 | < 0.1%
2017-05-11 11:56:12+00:00 | 15 | < 0.1%
2017-05-11 10:33:34+00:00 | 15 | < 0.1%
2017-05-13 16:29:34+00:00 | 13 | < 0.1%
2016-05-19 12:36:29+00:00 | 11 | < 0.1%
2017-05-14 18:23:36+00:00 | 10 | < 0.1%
2017-05-14 03:22:09+00:00 | 10 | < 0.1%
2017-05-14 05:13:09+00:00 | 10 | < 0.1%
2014-09-16 09:07:16+00:00 | 10 | < 0.1%
Other values (492888) | 500551 | > 99.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
2017-05-14 | 9400 | 0.9%
2017-05-15 | 7029 | 0.7%
2019-04-26 | 4702 | 0.5%
2019-05-12 | 4515 | 0.5%
2017-05-13 | 3977 | 0.4%
2017-05-12 | 3728 | 0.4%
2017-05-16 | 3503 | 0.3%
2019-04-27 | 2967 | 0.3%
2019-04-25 | 2611 | 0.3%
2019-05-18 | 2278 | 0.2%
Other values (88510) | 956710 | 95.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 3792242 | 30.3%
: | 1502130 | 12.0%
1 | 1389842 | 11.1%
2 | 1276079 | 10.2%
- | 1001420 | 8.0%
(space) | 500710 | 4.0%
+ | 500710 | 4.0%
5 | 462279 | 3.7%
3 | 434866 | 3.5%
4 | 414797 | 3.3%
Other values (4) | 1242675 | 9.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9012780
72.0%
Other Punctuation 1502130
 
12.0%
Dash Punctuation 1001420
 
8.0%
Space Separator 500710
 
4.0%
Math Symbol 500710
 
4.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3792242
42.1%
1 1389842
 
15.4%
2 1276079
 
14.2%
5 462279
 
5.1%
3 434866
 
4.8%
4 414797
 
4.6%
7 331685
 
3.7%
9 330297
 
3.7%
8 315500
 
3.5%
6 265193
 
2.9%
Other Punctuation
ValueCountFrequency (%)
: 1502130
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1001420
100.0%
Space Separator
ValueCountFrequency (%)
500710
100.0%
Math Symbol
ValueCountFrequency (%)
+ 500710
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12517750
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3792242
30.3%
: 1502130
 
12.0%
1 1389842
 
11.1%
2 1276079
 
10.2%
- 1001420
 
8.0%
500710
 
4.0%
+ 500710
 
4.0%
5 462279
 
3.7%
3 434866
 
3.5%
4 414797
 
3.3%
Other values (4) 1242675
 
9.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12517750
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3792242
30.3%
: 1502130
 
12.0%
1 1389842
 
11.1%
2 1276079
 
10.2%
- 1001420
 
8.0%
500710
 
4.0%
+ 500710
 
4.0%
5 462279
 
3.7%
3 434866
 
3.5%
4 414797
 
3.3%
Other values (4) 1242675
 
9.9%

tweet_id
Real number (ℝ)

Distinct: 500710
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0445043 × 10^18
Minimum: 3.7471893 × 10^17
Maximum: 1.4654691 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.8 MiB

Quantile statistics

Minimum: 3.7471893 × 10^17
5-th percentile: 6.222359 × 10^17
Q1: 8.6592314 × 10^17
Median: 1.0553803 × 10^18
Q3: 1.2169744 × 10^18
95-th percentile: 1.4099342 × 10^18
Maximum: 1.4654691 × 10^18
Range: 1.0907502 × 10^18
Interquartile range (IQR): 3.5105126 × 10^17

Descriptive statistics

Standard deviation: 2.3015116 × 10^17
Coefficient of variation (CV): 0.22034487
Kurtosis: -0.46774291
Mean: 1.0445043 × 10^18
Median Absolute Deviation (MAD): 1.8221344 × 10^17
Skewness: -0.23110444
Sum: -8.3502747 × 10^18
Variance: 5.2969558 × 10^34
Monotonicity: Not monotonic
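
The Sum reported above is negative even though the minimum is positive, which is consistent with 64-bit integer wraparound: the true total (roughly mean × number of rows) is far larger than the biggest signed 64-bit value. The sketch below illustrates the effect with the standard library. It assumes the column is stored as a signed 64-bit integer, which the report does not state outright, and `wrap_int64` is a hypothetical helper.

```python
import ctypes

INT64_MAX = 2**63 - 1

def wrap_int64(value: int) -> int:
    """Reduce an arbitrary Python int to its signed 64-bit representation."""
    return ctypes.c_int64(value).value

# Figures copied from the report.
n_rows = 500710
mean_id = 1.0445043e18

# Approximate true sum; Python floats and ints do not wrap.
approx_true_sum = mean_id * n_rows

print(approx_true_sum > INT64_MAX)   # the real total does not fit in int64
print(wrap_int64(INT64_MAX + 1))     # one past the maximum wraps to a negative value
```

For identifier columns like this one the sum has no meaning anyway, so treating the IDs as strings or simply ignoring the Sum avoids the problem.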
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
3.747189287 × 10^17 | 1 | < 0.1%
1.134006431 × 10^18 | 1 | < 0.1%
1.133995094 × 10^18 | 1 | < 0.1%
1.133993357 × 10^18 | 1 | < 0.1%
1.133992337 × 10^18 | 1 | < 0.1%
1.133991685 × 10^18 | 1 | < 0.1%
1.133990799 × 10^18 | 1 | < 0.1%
1.133990297 × 10^18 | 1 | < 0.1%
1.133983426 × 10^18 | 1 | < 0.1%
1.133980361 × 10^18 | 1 | < 0.1%
Other values (500700) | 500700 | > 99.9%

Minimum 10 values
Value | Count | Frequency (%)
3.747189287 × 10^17 | 1 | < 0.1%
3.747189379 × 10^17 | 1 | < 0.1%
3.748371279 × 10^17 | 1 | < 0.1%
3.748576657 × 10^17 | 1 | < 0.1%
3.749878767 × 10^17 | 1 | < 0.1%
3.749929245 × 10^17 | 1 | < 0.1%
3.750220232 × 10^17 | 1 | < 0.1%
3.750380101 × 10^17 | 1 | < 0.1%
3.750467486 × 10^17 | 1 | < 0.1%
3.750500781 × 10^17 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
1.465469131 × 10^18 | 1 | < 0.1%
1.465467688 × 10^18 | 1 | < 0.1%
1.46546752 × 10^18 | 1 | < 0.1%
1.46546751 × 10^18 | 1 | < 0.1%
1.465465214 × 10^18 | 1 | < 0.1%
1.465464424 × 10^18 | 1 | < 0.1%
1.465458763 × 10^18 | 1 | < 0.1%
1.465458264 × 10^18 | 1 | < 0.1%
1.465454732 × 10^18 | 1 | < 0.1%
1.465452411 × 10^18 | 1 | < 0.1%

sentiment_polarity
Real number (ℝ)

Distinct: 10228
Distinct (%): 2.0%
Missing: 49
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.10509341
Minimum: -1
Maximum: 0.9992
Zeros: 210275
Zeros (%): 42.0%
Negative: 90044
Negative (%): 18.0%
Memory size: 3.8 MiB

Quantile statistics

Minimum: -1
5-th percentile: -0.5719
Q1: 0
Median: 0
Q3: 0.3818
95-th percentile: 0.7269
Maximum: 0.9992
Range: 1.9992
Interquartile range (IQR): 0.3818

Descriptive statistics

Standard deviation: 0.36287641
Coefficient of variation (CV): 3.4528942
Kurtosis: 0.1553445
Mean: 0.10509341
Median Absolute Deviation (MAD): 0.2023
Skewness: -0.13871053
Sum: 52616.169
Variance: 0.13167929
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 210275 | 42.0%
0.4019 | 12194 | 2.4%
0.3818 | 9643 | 1.9%
0.296 | 6924 | 1.4%
0.2732 | 6375 | 1.3%
0.3612 | 6354 | 1.3%
0.3182 | 6211 | 1.2%
0.4215 | 6174 | 1.2%
0.4404 | 5901 | 1.2%
0.34 | 5894 | 1.2%
Other values (10218) | 224716 | 44.9%

Minimum 10 values
Value | Count | Frequency (%)
-1 | 1 | < 0.1%
-0.9989 | 1 | < 0.1%
-0.9987 | 1 | < 0.1%
-0.9984 | 1 | < 0.1%
-0.9977 | 1 | < 0.1%
-0.9948 | 1 | < 0.1%
-0.9928 | 2 | < 0.1%
-0.9897 | 1 | < 0.1%
-0.9896 | 1 | < 0.1%
-0.9883 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
0.9992 | 1 | < 0.1%
0.9985 | 1 | < 0.1%
0.9982 | 1 | < 0.1%
0.998 | 1 | < 0.1%
0.9979 | 1 | < 0.1%
0.997 | 1 | < 0.1%
0.9965 | 1 | < 0.1%
0.9964 | 1 | < 0.1%
0.9963 | 3 | < 0.1%
0.9955 | 1 | < 0.1%

text_lang_ft
Categorical

Distinct: 2648
Distinct (%): 0.5%
Missing: 49
Missing (%): < 0.1%
Memory size: 29.6 MiB

Value | Count
en 91 | 15340
en 90 | 15193
en 92 | 14877
en 89 | 14795
en 93 | 14699
Other values (2643) | 425757

Length

Max length: 6
Median length: 5
Mean length: 5.0003795
Min length: 4

Characters and Unicode

Total characters: 2503495
Distinct characters: 36
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 853
Unique (%): 0.2%

Sample

1st row: en 88
2nd row: en 84
3rd row: en 47
4th row: en 65
5th row: en 56
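
Each value packs two pieces of information into one string: a language code and a number. This is why the column has 2648 distinct values even though "en" dominates. The sketch below splits the two parts. It assumes the number is an integer confidence score for the detected language, which the report does not confirm, and `split_lang_score` is a hypothetical helper.

```python
from typing import Tuple

def split_lang_score(value: str) -> Tuple[str, int]:
    """Split a value such as 'en 88' into ('en', 88)."""
    lang, score = value.rsplit(" ", 1)
    return lang, int(score)

# Sample values copied from the report.
samples = ["en 88", "en 84", "en 47", "en 65", "en 56"]
pairs = [split_lang_score(s) for s in samples]

print(pairs[0])
print(sorted({lang for lang, _ in pairs}))   # distinct language codes in the sample
```

Profiling the two parts as separate columns would likely give a low-cardinality categorical and a numeric variable instead of one high-cardinality string.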

Common Values

Value | Count | Frequency (%)
en 91 | 15340 | 3.1%
en 90 | 15193 | 3.0%
en 92 | 14877 | 3.0%
en 89 | 14795 | 3.0%
en 93 | 14699 | 2.9%
en 88 | 14358 | 2.9%
en 94 | 14280 | 2.9%
en 87 | 13998 | 2.8%
en 86 | 13738 | 2.7%
en 85 | 13246 | 2.6%
Other values (2638) | 356137 | 71.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
en 462168
46.2%
91 16041
 
1.6%
90 15819
 
1.6%
92 15661
 
1.6%
93 15551
 
1.6%
89 15356
 
1.5%
94 15234
 
1.5%
88 14877
 
1.5%
87 14559
 
1.5%
86 14255
 
1.4%
Other values (173) 401801
40.1%

Most occurring characters

ValueCountFrequency (%)
500661
20.0%
e 468231
18.7%
n 463210
18.5%
8 186528
 
7.5%
9 173046
 
6.9%
7 150633
 
6.0%
6 115737
 
4.6%
5 89808
 
3.6%
4 72826
 
2.9%
3 62350
 
2.5%
Other values (26) 220465
8.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1001496
40.0%
Lowercase Letter 1001338
40.0%
Space Separator 500661
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 468231
46.8%
n 463210
46.3%
i 16113
 
1.6%
d 12273
 
1.2%
t 11898
 
1.2%
r 7261
 
0.7%
s 4175
 
0.4%
a 2734
 
0.3%
h 2665
 
0.3%
f 2473
 
0.2%
Other values (15) 10305
 
1.0%
Decimal Number
ValueCountFrequency (%)
8 186528
18.6%
9 173046
17.3%
7 150633
15.0%
6 115737
11.6%
5 89808
9.0%
4 72826
 
7.3%
3 62350
 
6.2%
2 53843
 
5.4%
1 49634
 
5.0%
0 47091
 
4.7%
Space Separator
ValueCountFrequency (%)
500661
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1502157
60.0%
Latin 1001338
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 468231
46.8%
n 463210
46.3%
i 16113
 
1.6%
d 12273
 
1.2%
t 11898
 
1.2%
r 7261
 
0.7%
s 4175
 
0.4%
a 2734
 
0.3%
h 2665
 
0.3%
f 2473
 
0.2%
Other values (15) 10305
 
1.0%
Common
ValueCountFrequency (%)
500661
33.3%
8 186528
 
12.4%
9 173046
 
11.5%
7 150633
 
10.0%
6 115737
 
7.7%
5 89808
 
6.0%
4 72826
 
4.8%
3 62350
 
4.2%
2 53843
 
3.6%
1 49634
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2503495
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
500661
20.0%
e 468231
18.7%
n 463210
18.5%
8 186528
 
7.5%
9 173046
 
6.9%
7 150633
 
6.0%
6 115737
 
4.6%
5 89808
 
3.6%
4 72826
 
2.9%
3 62350
 
2.5%
Other values (26) 220465
8.8%

text_normalized
Categorical

Distinct: 355734
Distinct (%): 71.1%
Missing: 49
Missing (%): < 0.1%
Memory size: 93.9 MiB

Value | Count
['china', 'new', 'silk', 'road'] | 833
['one', 'belt', 'one', 'road'] | 726
['belt', 'road'] | 542
['new', 'silk', 'road'] | 523
['beltandroad'] | 422
Other values (355729) | 497615

Length

Max length: 2853
Median length: 554
Mean length: 135.6264
Min length: 2

Characters and Unicode

Total characters: 67902849
Distinct characters: 3906
Distinct categories: 12
Distinct scripts: 32
Distinct blocks: 41
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 319884
Unique (%): 63.9%

Sample

1st row: ['nation', 'agree', 'build', 'new', 'silk', 'road', 'china', 'enhance', 'partnership', 'neighbor', 'west', 'aim']
2nd row: ['nation', 'agree', 'build', 'new', 'silk', 'road', 'china', 'enhance', 'partnership', 'neighbor', 'west', 'aim']
3rd row: ['high', 'speed', 'rail', 'china', 'new', 'silk', 'road', 'perspective']
4th row: ['nation', 'agree', 'build', 'new', 'silk', 'road']
5th row: ['china', 'kazakhstan', 'tajikistan', 'russia', 'mongolia', 'build', 'new', 'silk', 'road']
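
The sample rows are token lists that have been stored as strings, so the profiler counts the brackets, quotes, and commas as characters. A minimal sketch of turning such a cell back into a Python list and counting tokens is shown below; `parse_tokens` is a hypothetical helper, and the approach assumes every cell is a valid Python list literal.

```python
import ast
from collections import Counter

# Sample cells copied from the report, plus the empty list seen in the common values.
samples = [
    "['high', 'speed', 'rail', 'china', 'new', 'silk', 'road', 'perspective']",
    "['nation', 'agree', 'build', 'new', 'silk', 'road']",
    "[]",
]

def parse_tokens(cell):
    """Safely evaluate a stringified list of tokens."""
    tokens = ast.literal_eval(cell)   # parses literals only, never runs code
    if not isinstance(tokens, list):
        raise ValueError(f"expected a list literal, got {type(tokens).__name__}")
    return tokens

token_counts = Counter(tok for cell in samples for tok in parse_tokens(cell))

print(parse_tokens(samples[1]))
print(token_counts["road"])
```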

Common Values

Value | Count | Frequency (%)
['china', 'new', 'silk', 'road'] | 833 | 0.2%
['one', 'belt', 'one', 'road'] | 726 | 0.1%
['belt', 'road'] | 542 | 0.1%
['new', 'silk', 'road'] | 523 | 0.1%
['beltandroad'] | 422 | 0.1%
['belt', 'road', 'initiative'] | 377 | 0.1%
[] | 365 | 0.1%
['chinas', 'belt', 'road', 'plan', 'pakistan', 'take', 'military', 'turn'] | 362 | 0.1%
['china', 'invest', '124bn', 'belt', 'road', 'global', 'trade', 'project'] | 289 | 0.1%
['china', '900', 'billion', 'new', 'silk', 'road', 'need', 'know'] | 287 | 0.1%
Other values (355724) | 495935 | 99.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
road 431688
 
6.3%
belt 328416
 
4.8%
china 271855
 
4.0%
one 156334
 
2.3%
initiative 140139
 
2.0%
new 104540
 
1.5%
silk 104467
 
1.5%
beltandroad 83517
 
1.2%
project 49189
 
0.7%
chinese 47783
 
0.7%
Other values (182788) 5137511
74.9%

Most occurring characters

ValueCountFrequency (%)
' 13710148
20.2%
, 6354778
 
9.4%
6354778
 
9.4%
e 4371211
 
6.4%
a 3895533
 
5.7%
i 3675622
 
5.4%
n 3272107
 
4.8%
t 3061379
 
4.5%
o 2972794
 
4.4%
r 2874467
 
4.2%
Other values (3896) 17360032
25.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 39962011
58.9%
Other Punctuation 20064926
29.5%
Space Separator 6354778
 
9.4%
Open Punctuation 500661
 
0.7%
Close Punctuation 500661
 
0.7%
Decimal Number 384214
 
0.6%
Other Letter 115550
 
0.2%
Connector Punctuation 10055
 
< 0.1%
Uppercase Letter 9590
 
< 0.1%
Modifier Letter 190
 
< 0.1%
Other values (2) 213
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3932
 
3.4%
ا 3725
 
3.2%
ی 2520
 
2.2%
2220
 
1.9%
ر 2204
 
1.9%
2070
 
1.8%
1951
 
1.7%
ن 1733
 
1.5%
1709
 
1.5%
1645
 
1.4%
Other values (3391) 91841
79.5%
Lowercase Letter
ValueCountFrequency (%)
e 4371211
10.9%
a 3895533
 
9.7%
i 3675622
 
9.2%
n 3272107
 
8.2%
t 3061379
 
7.7%
o 2972794
 
7.4%
r 2874467
 
7.2%
l 2027982
 
5.1%
s 1833421
 
4.6%
c 1763768
 
4.4%
Other values (323) 10213727
25.6%
Decimal Number
ValueCountFrequency (%)
1 80799
21.0%
0 79994
20.8%
2 77291
20.1%
3 26095
 
6.8%
9 24618
 
6.4%
5 22324
 
5.8%
7 20952
 
5.5%
4 19823
 
5.2%
8 16127
 
4.2%
6 15336
 
4.0%
Other values (68) 855
 
0.2%
Uppercase Letter
ValueCountFrequency (%)
I 3708
38.7%
𝗔 889
 
9.3%
𝗧 716
 
7.5%
𝗖 532
 
5.5%
𝗬 489
 
5.1%
𝗦 485
 
5.1%
𝗢 484
 
5.0%
𝗣 468
 
4.9%
𝗨 288
 
3.0%
𝗗 282
 
2.9%
Other values (47) 1249
 
13.0%
Other Number
ValueCountFrequency (%)
11
 
10.1%
11
 
10.1%
11
 
10.1%
11
 
10.1%
11
 
10.1%
² 5
 
4.6%
5
 
4.6%
½ 4
 
3.7%
4
 
3.7%
4
 
3.7%
Other values (14) 32
29.4%
Modifier Letter
ValueCountFrequency (%)
135
71.1%
48
 
25.3%
3
 
1.6%
ʽ 2
 
1.1%
ˈ 1
 
0.5%
1
 
0.5%
Other Punctuation
ValueCountFrequency (%)
' 13710148
68.3%
, 6354778
31.7%
Space Separator
ValueCountFrequency (%)
6354778
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 500661
100.0%
Close Punctuation
ValueCountFrequency (%)
] 500661
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 10055
100.0%
Nonspacing Mark
ValueCountFrequency (%)
̇ 104
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 39917656
58.8%
Common 27825292
41.0%
Greek 34183
 
0.1%
Han 31677
 
< 0.1%
Arabic 26015
 
< 0.1%
Thai 24318
 
< 0.1%
Cyrillic 9420
 
< 0.1%
Devanagari 8179
 
< 0.1%
Myanmar 7687
 
< 0.1%
Hebrew 4320
 
< 0.1%
Other values (22) 14102
 
< 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
3932
 
12.4%
2070
 
6.5%
1645
 
5.2%
761
 
2.4%
753
 
2.4%
458
 
1.4%
391
 
1.2%
304
 
1.0%
247
 
0.8%
196
 
0.6%
Other values (2331) 20920
66.0%
Common
ValueCountFrequency (%)
' 13710148
49.3%
, 6354778
22.8%
6354778
22.8%
[ 500661
 
1.8%
] 500661
 
1.8%
1 80799
 
0.3%
0 79994
 
0.3%
2 77291
 
0.3%
3 26095
 
0.1%
9 24618
 
0.1%
Other values (198) 115469
 
0.4%
Arabic
ValueCountFrequency (%)
ا 3725
14.3%
ی 2520
 
9.7%
ر 2204
 
8.5%
ن 1733
 
6.7%
و 1484
 
5.7%
ه 1292
 
5.0%
د 1277
 
4.9%
م 1239
 
4.8%
ت 1235
 
4.7%
ب 992
 
3.8%
Other values (141) 8314
32.0%
Latin
ValueCountFrequency (%)
e 4371211
11.0%
a 3895533
 
9.8%
i 3675622
 
9.2%
n 3272107
 
8.2%
t 3061379
 
7.7%
o 2972794
 
7.4%
r 2874467
 
7.2%
l 2027982
 
5.1%
s 1833421
 
4.6%
c 1763768
 
4.4%
Other values (137) 10169372
25.5%
Hangul
ValueCountFrequency (%)
47
 
9.3%
43
 
8.5%
39
 
7.7%
37
 
7.3%
14
 
2.8%
14
 
2.8%
13
 
2.6%
13
 
2.6%
12
 
2.4%
10
 
2.0%
Other values (124) 264
52.2%
Ethiopic
ValueCountFrequency (%)
35
 
5.5%
35
 
5.5%
28
 
4.4%
23
 
3.6%
20
 
3.1%
19
 
3.0%
18
 
2.8%
17
 
2.7%
17
 
2.7%
16
 
2.5%
Other values (109) 407
64.1%
Katakana
ValueCountFrequency (%)
53
 
8.3%
53
 
8.3%
50
 
7.8%
43
 
6.7%
43
 
6.7%
38
 
5.9%
37
 
5.8%
27
 
4.2%
17
 
2.7%
12
 
1.9%
Other values (63) 268
41.8%
Hiragana
ValueCountFrequency (%)
69
 
10.9%
43
 
6.8%
36
 
5.7%
34
 
5.4%
29
 
4.6%
28
 
4.4%
26
 
4.1%
26
 
4.1%
20
 
3.1%
20
 
3.1%
Other values (53) 304
47.9%
Myanmar
ValueCountFrequency (%)
858
11.2%
785
 
10.2%
က 724
 
9.4%
676
 
8.8%
533
 
6.9%
514
 
6.7%
457
 
5.9%
374
 
4.9%
358
 
4.7%
284
 
3.7%
Other values (49) 2124
27.6%
Thai
ValueCountFrequency (%)
2220
 
9.1%
1951
 
8.0%
1709
 
7.0%
1306
 
5.4%
1227
 
5.0%
1198
 
4.9%
1124
 
4.6%
1117
 
4.6%
1010
 
4.2%
916
 
3.8%
Other values (44) 10540
43.3%
Devanagari
ValueCountFrequency (%)
958
 
11.7%
927
 
11.3%
737
 
9.0%
462
 
5.6%
455
 
5.6%
438
 
5.4%
374
 
4.6%
357
 
4.4%
286
 
3.5%
283
 
3.5%
Other values (43) 2902
35.5%
Cyrillic
ValueCountFrequency (%)
а 1078
 
11.4%
н 685
 
7.3%
о 557
 
5.9%
э 511
 
5.4%
р 495
 
5.3%
л 481
 
5.1%
и 476
 
5.1%
д 440
 
4.7%
г 429
 
4.6%
т 397
 
4.2%
Other values (41) 3871
41.1%
Khmer
ValueCountFrequency (%)
117
 
10.5%
100
 
9.0%
89
 
8.0%
81
 
7.3%
72
 
6.5%
63
 
5.7%
60
 
5.4%
60
 
5.4%
57
 
5.1%
45
 
4.0%
Other values (33) 371
33.3%
Sinhala
ValueCountFrequency (%)
474
12.1%
353
 
9.0%
325
 
8.3%
316
 
8.1%
287
 
7.3%
268
 
6.9%
266
 
6.8%
200
 
5.1%
158
 
4.0%
152
 
3.9%
Other values (31) 1107
28.3%
Lao
ValueCountFrequency (%)
134
 
11.3%
116
 
9.8%
84
 
7.1%
66
 
5.6%
63
 
5.3%
61
 
5.1%
60
 
5.1%
57
 
4.8%
56
 
4.7%
44
 
3.7%
Other values (30) 446
37.6%
Greek
ValueCountFrequency (%)
α 3120
 
9.1%
ο 3006
 
8.8%
τ 2797
 
8.2%
ι 2224
 
6.5%
ε 2199
 
6.4%
ν 1945
 
5.7%
σ 1725
 
5.0%
ρ 1606
 
4.7%
η 1482
 
4.3%
ς 1339
 
3.9%
Other values (25) 12740
37.3%
Bengali
ValueCountFrequency (%)
28
 
10.8%
24
 
9.3%
19
 
7.3%
19
 
7.3%
18
 
6.9%
16
 
6.2%
15
 
5.8%
12
 
4.6%
10
 
3.9%
9
 
3.5%
Other values (25) 89
34.4%
Tamil
ValueCountFrequency (%)
499
13.2%
483
12.8%
369
9.8%
290
 
7.7%
230
 
6.1%
222
 
5.9%
210
 
5.5%
200
 
5.3%
193
 
5.1%
183
 
4.8%
Other values (24) 905
23.9%
Oriya
ValueCountFrequency (%)
32
17.6%
16
 
8.8%
13
 
7.1%
10
 
5.5%
10
 
5.5%
10
 
5.5%
7
 
3.8%
6
 
3.3%
5
 
2.7%
5
 
2.7%
Other values (24) 68
37.4%
Gujarati
ValueCountFrequency (%)
56
18.8%
34
 
11.4%
21
 
7.0%
18
 
6.0%
18
 
6.0%
14
 
4.7%
13
 
4.4%
11
 
3.7%
10
 
3.4%
9
 
3.0%
Other values (20) 94
31.5%
Kannada
ValueCountFrequency (%)
36
 
12.4%
28
 
9.6%
25
 
8.6%
20
 
6.9%
15
 
5.2%
13
 
4.5%
12
 
4.1%
12
 
4.1%
12
 
4.1%
12
 
4.1%
Other values (20) 106
36.4%
Tibetan
ValueCountFrequency (%)
42
20.9%
24
11.9%
21
10.4%
20
10.0%
11
 
5.5%
11
 
5.5%
10
 
5.0%
9
 
4.5%
9
 
4.5%
7
 
3.5%
Other values (18) 37
18.4%
Hebrew
ValueCountFrequency (%)
י 527
 
12.2%
ו 411
 
9.5%
ה 398
 
9.2%
ל 298
 
6.9%
ת 243
 
5.6%
ר 238
 
5.5%
מ 238
 
5.5%
ש 223
 
5.2%
א 218
 
5.0%
ב 191
 
4.4%
Other values (17) 1335
30.9%
Telugu
ValueCountFrequency (%)
35
13.4%
32
12.2%
21
 
8.0%
18
 
6.9%
17
 
6.5%
15
 
5.7%
13
 
5.0%
13
 
5.0%
12
 
4.6%
10
 
3.8%
Other values (15) 76
29.0%
Syloti_Nagri
ValueCountFrequency (%)
6
18.2%
4
12.1%
3
9.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
1
 
3.0%
Other values (7) 7
21.2%
Canadian_Aboriginal
ValueCountFrequency (%)
3
13.0%
3
13.0%
2
 
8.7%
2
 
8.7%
2
 
8.7%
2
 
8.7%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
Other values (5) 5
21.7%
Mongolian
ValueCountFrequency (%)
3
37.5%
2
25.0%
1
 
12.5%
1
 
12.5%
1
 
12.5%
Georgian
ValueCountFrequency (%)
2
28.6%
2
28.6%
1
14.3%
1
14.3%
1
14.3%
Armenian
ValueCountFrequency (%)
է 12
75.0%
ե 2
 
12.5%
հ 1
 
6.2%
պ 1
 
6.2%
Malayalam
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Inherited
ValueCountFrequency (%)
̇ 104
100.0%
Bopomofo
ValueCountFrequency (%)
6
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 67717532
99.7%
None 49016
 
0.1%
CJK 31673
 
< 0.1%
Arabic 25710
 
< 0.1%
Thai 24318
 
< 0.1%
Math Alphanum 10535
 
< 0.1%
Cyrillic 9420
 
< 0.1%
Devanagari 8179
 
< 0.1%
Myanmar 7687
 
< 0.1%
Hebrew 4320
 
< 0.1%
Other values (31) 14459
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
' 13710148
20.2%
, 6354778
 
9.4%
6354778
 
9.4%
e 4371211
 
6.5%
a 3895533
 
5.8%
i 3675622
 
5.4%
n 3272107
 
4.8%
t 3061379
 
4.5%
o 2972794
 
4.4%
r 2874467
 
4.2%
Other values (33) 17174715
25.4%
CJK
ValueCountFrequency (%)
3932
 
12.4%
2070
 
6.5%
1645
 
5.2%
761
 
2.4%
753
 
2.4%
458
 
1.4%
391
 
1.2%
304
 
1.0%
247
 
0.8%
196
 
0.6%
Other values (2329) 20916
66.0%
Arabic
ValueCountFrequency (%)
ا 3725
14.5%
ی 2520
 
9.8%
ر 2204
 
8.6%
ن 1733
 
6.7%
و 1484
 
5.8%
ه 1292
 
5.0%
د 1277
 
5.0%
م 1239
 
4.8%
ت 1235
 
4.8%
ب 992
 
3.9%
Other values (69) 8009
31.2%
None
ValueCountFrequency (%)
α 3120
 
6.4%
ο 3006
 
6.1%
τ 2797
 
5.7%
ι 2224
 
4.5%
ε 2199
 
4.5%
ν 1945
 
4.0%
σ 1725
 
3.5%
í 1642
 
3.3%
ã 1619
 
3.3%
é 1615
 
3.3%
Other values (209) 27124
55.3%
Thai
ValueCountFrequency (%)
2220
 
9.1%
1951
 
8.0%
1709
 
7.0%
1306
 
5.4%
1227
 
5.0%
1198
 
4.9%
1124
 
4.6%
1117
 
4.6%
1010
 
4.2%
916
 
3.8%
Other values (44) 10540
43.3%
Cyrillic
ValueCountFrequency (%)
а 1078
 
11.4%
н 685
 
7.3%
о 557
 
5.9%
э 511
 
5.4%
р 495
 
5.3%
л 481
 
5.1%
и 476
 
5.1%
д 440
 
4.7%
г 429
 
4.6%
т 397
 
4.2%
Other values (41) 3871
41.1%
Devanagari
ValueCountFrequency (%)
958
 
11.7%
927
 
11.3%
737
 
9.0%
462
 
5.6%
455
 
5.6%
438
 
5.4%
374
 
4.6%
357
 
4.4%
286
 
3.5%
283
 
3.5%
Other values (43) 2902
35.5%
Math Alphanum
ValueCountFrequency (%)
𝗔 889
 
8.4%
𝗧 716
 
6.8%
𝗲 682
 
6.5%
𝘀 666
 
6.3%
𝗖 532
 
5.0%
𝗬 489
 
4.6%
𝗦 485
 
4.6%
𝗢 484
 
4.6%
𝗣 468
 
4.4%
𝗮 376
 
3.6%
Other values (144) 4748
45.1%
Myanmar
ValueCountFrequency (%)
858
11.2%
785
 
10.2%
က 724
 
9.4%
676
 
8.8%
533
 
6.9%
514
 
6.7%
457
 
5.9%
374
 
4.9%
358
 
4.7%
284
 
3.7%
Other values (49) 2124
27.6%
Hebrew
ValueCountFrequency (%)
י 527
 
12.2%
ו 411
 
9.5%
ה 398
 
9.2%
ל 298
 
6.9%
ת 243
 
5.6%
ר 238
 
5.5%
מ 238
 
5.5%
ש 223
 
5.2%
א 218
 
5.0%
ב 191
 
4.4%
Other values (17) 1335
30.9%
Tamil
ValueCountFrequency (%)
499
13.2%
483
12.8%
369
9.8%
290
 
7.7%
230
 
6.1%
222
 
5.9%
210
 
5.5%
200
 
5.3%
193
 
5.1%
183
 
4.8%
Other values (24) 905
23.9%
Sinhala
ValueCountFrequency (%)
474
12.1%
353
 
9.0%
325
 
8.3%
316
 
8.1%
287
 
7.3%
268
 
6.9%
266
 
6.8%
200
 
5.1%
158
 
4.0%
152
 
3.9%
Other values (31) 1107
28.3%
Katakana
ValueCountFrequency (%)
135
17.6%
53
 
6.9%
53
 
6.9%
50
 
6.5%
43
 
5.6%
43
 
5.6%
38
 
4.9%
37
 
4.8%
27
 
3.5%
17
 
2.2%
Other values (59) 272
35.4%
Lao
ValueCountFrequency (%)
134
 
11.3%
116
 
9.8%
84
 
7.1%
66
 
5.6%
63
 
5.3%
61
 
5.1%
60
 
5.1%
57
 
4.8%
56
 
4.7%
44
 
3.7%
Other values (30) 446
37.6%
Khmer
ValueCountFrequency (%)
117
 
10.5%
100
 
9.0%
89
 
8.0%
81
 
7.3%
72
 
6.5%
63
 
5.7%
60
 
5.4%
60
 
5.4%
57
 
5.1%
45
 
4.0%
Other values (33) 371
33.3%
Diacriticals
ValueCountFrequency (%)
̇ 104
100.0%
Hiragana
ValueCountFrequency (%)
69
 
10.9%
43
 
6.8%
36
 
5.7%
34
 
5.4%
29
 
4.6%
28
 
4.4%
26
 
4.1%
26
 
4.1%
20
 
3.1%
20
 
3.1%
Other values (53) 304
47.9%
Gujarati
ValueCountFrequency (%)
56
18.8%
34
 
11.4%
21
 
7.0%
18
 
6.0%
18
 
6.0%
14
 
4.7%
13
 
4.4%
11
 
3.7%
10
 
3.4%
9
 
3.0%
Other values (20) 94
31.5%
Hangul
ValueCountFrequency (%)
47
 
9.3%
43
 
8.5%
39
 
7.7%
37
 
7.3%
14
 
2.8%
14
 
2.8%
13
 
2.6%
13
 
2.6%
12
 
2.4%
10
 
2.0%
Other values (124) 264
52.2%
Letterlike Symbols
ValueCountFrequency (%)
45
86.5%
7
 
13.5%
Tibetan
ValueCountFrequency (%)
42
20.9%
24
11.9%
21
10.4%
20
10.0%
11
 
5.5%
11
 
5.5%
10
 
5.0%
9
 
4.5%
9
 
4.5%
7
 
3.5%
Other values (18) 37
18.4%
Kannada
ValueCountFrequency (%)
36
 
12.4%
28
 
9.6%
25
 
8.6%
20
 
6.9%
15
 
5.2%
13
 
4.5%
12
 
4.1%
12
 
4.1%
12
 
4.1%
12
 
4.1%
Other values (20) 106
36.4%
Ethiopic
ValueCountFrequency (%)
35
 
5.5%
35
 
5.5%
28
 
4.4%
23
 
3.6%
20
 
3.1%
19
 
3.0%
18
 
2.8%
17
 
2.7%
17
 
2.7%
16
 
2.5%
Other values (109) 407
64.1%
Telugu
ValueCountFrequency (%)
35
13.4%
32
12.2%
21
 
8.0%
18
 
6.9%
17
 
6.5%
15
 
5.7%
13
 
5.0%
13
 
5.0%
12
 
4.6%
10
 
3.8%
Other values (15) 76
29.0%
Oriya
ValueCountFrequency (%)
32
17.6%
16
 
8.8%
13
 
7.1%
10
 
5.5%
10
 
5.5%
10
 
5.5%
7
 
3.8%
6
 
3.3%
5
 
2.7%
5
 
2.7%
Other values (24) 68
37.4%
Bengali
ValueCountFrequency (%)
28
 
10.8%
24
 
9.3%
19
 
7.3%
19
 
7.3%
18
 
6.9%
16
 
6.2%
15
 
5.8%
12
 
4.6%
10
 
3.9%
9
 
3.5%
Other values (25) 89
34.4%
Latin Ext Additional
ValueCountFrequency (%)
ế 18
27.7%
9
13.8%
6
 
9.2%
6
 
9.2%
3
 
4.6%
3
 
4.6%
3
 
4.6%
2
 
3.1%
2
 
3.1%
2
 
3.1%
Other values (10) 11
16.9%
Armenian
ValueCountFrequency (%)
է 12
75.0%
ե 2
 
12.5%
հ 1
 
6.2%
պ 1
 
6.2%
Enclosed Alphanum
ValueCountFrequency (%)
11
12.1%
11
12.1%
11
12.1%
11
12.1%
11
12.1%
5
 
5.5%
4
 
4.4%
4
 
4.4%
3
 
3.3%
3
 
3.3%
Other values (7) 17
18.7%
Syloti Nagri
ValueCountFrequency (%)
6
18.2%
4
12.1%
3
9.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
1
 
3.0%
Other values (7) 7
21.2%
Bopomofo
ValueCountFrequency (%)
6
100.0%
Alphabetic PF
ValueCountFrequency (%)
4
100.0%
UCAS
ValueCountFrequency (%)
3
13.0%
3
13.0%
2
 
8.7%
2
 
8.7%
2
 
8.7%
2
 
8.7%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
Other values (5) 5
21.7%
IPA Ext
ValueCountFrequency (%)
ə 3
42.9%
ʃ 1
 
14.3%
ʊ 1
 
14.3%
ʌ 1
 
14.3%
ʘ 1
 
14.3%
Mongolian
ValueCountFrequency (%)
3
37.5%
2
25.0%
1
 
12.5%
1
 
12.5%
1
 
12.5%
Dingbats
ValueCountFrequency (%)
2
33.3%
2
33.3%
2
33.3%
Modifier Letters
ValueCountFrequency (%)
ʽ 2
66.7%
ˈ 1
33.3%
Georgian
ValueCountFrequency (%)
2
28.6%
2
28.6%
1
14.3%
1
14.3%
1
14.3%
Malayalam
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
CJK Ext B
ValueCountFrequency (%)
𠯆 1
100.0%
Number Forms
ValueCountFrequency (%)
1
100.0%

links
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 95588
Distinct (%): 63.0%
Missing: 349025
Missing (%): 69.7%
Memory size: 28.9 MiB

Value | Count
https://www.reuters.com/?edition-redirect=uk | 328
https://nyti.ms/2GutBQB | 282
https://www.youtube.com/watch?v=cUxw9Re-Z-E&feature=youtu.be | 194
http://English.news | 190
https://www.bbc.co.uk/news/world-asia-39912671 | 185
Other values (95583) | 150506

Length

Max length: 1115
Median length: 630
Mean length: 69.141761
Min length: 11

Characters and Unicode

Total characters: 10487768
Distinct characters: 92
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 80106
Unique (%): 52.8%

Sample

1st row: http://bit.ly/17lyTPM
2nd row: http://bit.ly/17lySv6
3rd row: http://usa.chinadaily.com.cn/epaper/2013-09/03/content_16940556.htm
4th row: http://usa.chinadaily.com.cn/epaper/2013-09/03/content_16940556.htm
5th row: http://buff.ly/18qBTwC
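
Full URLs are almost all distinct, which is why the column triggers the high-cardinality alert; grouping by host usually gives a more useful summary. The sketch below does this for the sample rows with the standard library. `host_of` is a hypothetical helper, and shortened links (bit.ly, buff.ly) would still need to be resolved separately to reveal their real targets.

```python
from collections import Counter
from urllib.parse import urlparse

# Sample values copied from the report.
samples = [
    "http://bit.ly/17lyTPM",
    "http://bit.ly/17lySv6",
    "http://usa.chinadaily.com.cn/epaper/2013-09/03/content_16940556.htm",
    "http://usa.chinadaily.com.cn/epaper/2013-09/03/content_16940556.htm",
    "http://buff.ly/18qBTwC",
]

def host_of(url):
    """Return the lower-cased network location (host) of a URL."""
    return urlparse(url).netloc.lower()

host_counts = Counter(host_of(u) for u in samples)

print(host_counts.most_common())
```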

Common Values

Value | Count | Frequency (%)
https://www.reuters.com/?edition-redirect=uk | 328 | 0.1%
https://nyti.ms/2GutBQB | 282 | 0.1%
https://www.youtube.com/watch?v=cUxw9Re-Z-E&feature=youtu.be | 194 | < 0.1%
http://English.news | 190 | < 0.1%
https://www.bbc.co.uk/news/world-asia-39912671 | 185 | < 0.1%
https://www.lavoceditrieste.net/2015/10/31/does-rome-want-to-eliminate-the-european-singapore/ | 177 | < 0.1%
https://sc.mp/2rCpF5x | 176 | < 0.1%
https://www.reuters.com/?edition-redirect=in | 176 | < 0.1%
https://www.lavoceditrieste.net/2015/10/31/roma-vuole-eliminare-la-singapore-deuropa/ | 170 | < 0.1%
https://edition.cnn.com/2017/05/13/asia/china-belt-and-road-forum-xi-putin-erdogan/index.html | 162 | < 0.1%
Other values (95578) | 149645 | 29.9%
(Missing) | 349025 | 69.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
https://www.reuters.com/?edition-redirect=uk 328
 
0.2%
https://nyti.ms/2gutbqb 282
 
0.2%
https://www.youtube.com/watch?v=cuxw9re-z-e&feature=youtu.be 194
 
0.1%
http://english.news 190
 
0.1%
https://www.bbc.co.uk/news/world-asia-39912671 185
 
0.1%
https://www.lavoceditrieste.net/2015/10/31/does-rome-want-to-eliminate-the-european-singapore 177
 
0.1%
https://sc.mp/2rcpf5x 176
 
0.1%
https://www.reuters.com/?edition-redirect=in 176
 
0.1%
https://www.lavoceditrieste.net/2015/10/31/roma-vuole-eliminare-la-singapore-deuropa 170
 
0.1%
https://edition.cnn.com/2017/05/13/asia/china-belt-and-road-forum-xi-putin-erdogan/index.html 162
 
0.1%
Other values (95500) 149646
98.7%

Most occurring characters

Value | Count | Frequency (%)
t | 830242 | 7.9%
/ | 631408 | 6.0%
e | 537152 | 5.1%
i | 440320 | 4.2%
o | 424284 | 4.0%
s | 422727 | 4.0%
- | 418155 | 4.0%
a | 411726 | 3.9%
n | 353806 | 3.4%
c | 323647 | 3.1%
Other values (82) | 5694301 | 54.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 6875107 | 65.6%
Other Punctuation | 1228186 | 11.7%
Decimal Number | 1046914 | 10.0%
Uppercase Letter | 707794 | 6.7%
Dash Punctuation | 418155 | 4.0%
Math Symbol | 136392 | 1.3%
Connector Punctuation | 75041 | 0.7%
Open Punctuation | 87 | < 0.1%
Close Punctuation | 85 | < 0.1%
Currency Symbol | 6 | < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 830242
 
12.1%
e 537152
 
7.8%
i 440320
 
6.4%
o 424284
 
6.2%
s 422727
 
6.1%
a 411726
 
6.0%
n 353806
 
5.1%
c 323647
 
4.7%
r 321209
 
4.7%
h 318879
 
4.6%
Other values (17) 2491115
36.2%
Uppercase Letter
ValueCountFrequency (%)
N 38900
 
5.5%
S 38296
 
5.4%
W 37600
 
5.3%
A 33851
 
4.8%
F 32353
 
4.6%
Y 32148
 
4.5%
R 32079
 
4.5%
C 31597
 
4.5%
Z 31460
 
4.4%
X 31184
 
4.4%
Other values (16) 368326
52.0%
Other Punctuation
ValueCountFrequency (%)
/ 631408
51.4%
. 282357
23.0%
: 162784
 
13.3%
& 94795
 
7.7%
? 38465
 
3.1%
% 16946
 
1.4%
# 838
 
0.1%
, 152
 
< 0.1%
! 134
 
< 0.1%
' 107
 
< 0.1%
Other values (5) 200
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 164762
15.7%
1 157311
15.0%
0 153878
14.7%
3 93127
8.9%
8 88322
8.4%
5 81760
7.8%
9 79643
7.6%
7 78387
7.5%
6 75027
7.2%
4 74697
7.1%
Math Symbol
ValueCountFrequency (%)
= 133503
97.9%
+ 2677
 
2.0%
| 191
 
0.1%
~ 21
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 48
55.2%
( 35
40.2%
{ 4
 
4.6%
Close Punctuation
ValueCountFrequency (%)
] 46
54.1%
) 35
41.2%
} 4
 
4.7%
Dash Punctuation
ValueCountFrequency (%)
- 418155
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 75041
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 6
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 7582901 | 72.3%
Common | 2904867 | 27.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 830242
 
10.9%
e 537152
 
7.1%
i 440320
 
5.8%
o 424284
 
5.6%
s 422727
 
5.6%
a 411726
 
5.4%
n 353806
 
4.7%
c 323647
 
4.3%
r 321209
 
4.2%
h 318879
 
4.2%
Other values (43) 3198909
42.2%
Common
ValueCountFrequency (%)
/ 631408
21.7%
- 418155
14.4%
. 282357
9.7%
2 164762
 
5.7%
: 162784
 
5.6%
1 157311
 
5.4%
0 153878
 
5.3%
= 133503
 
4.6%
& 94795
 
3.3%
3 93127
 
3.2%
Other values (29) 612787
21.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10487767 | > 99.9%
None | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 830242
 
7.9%
/ 631408
 
6.0%
e 537152
 
5.1%
i 440320
 
4.2%
o 424284
 
4.0%
s 422727
 
4.0%
- 418155
 
4.0%
a 411726
 
3.9%
n 353806
 
3.4%
c 323647
 
3.1%
Other values (81) 5694300
54.3%
None
ValueCountFrequency (%)
ł 1
100.0%

hashtag
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 93371
Distinct (%): 49.4%
Missing: 311862
Missing (%): 62.3%
Memory size: 25.5 MiB
BeltandRoad | 14910
China | 3760
beltandroad | 1538
China BeltandRoad | 1397
OBOR | 1201
Other values (93366) | 166042
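
The percentage figures reported for this variable can be checked by hand. The sketch below recomputes them from the raw counts, using the "Number of observations" value (500710) from the report overview. One point is an inference rather than something the report states: dividing Distinct and Unique by the number of non-missing rows (rather than by all rows) is what reproduces the reported 49.4% and 42.6%. The helper name `pct` is illustrative.

```python
def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


n_observations = 500710  # "Number of observations" from the report overview
n_missing = 311862       # hashtag: Missing
n_distinct = 93371       # hashtag: Distinct
n_unique = 80452         # hashtag: Unique

# Rows that actually carry a hashtag value.
n_present = n_observations - n_missing

print("Missing (%): ", pct(n_missing, n_observations))
# Assumption: the denominator is the non-missing row count.
print("Distinct (%):", pct(n_distinct, n_present))
print("Unique (%):  ", pct(n_unique, n_present))
```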

Length

Max length: 257
Median length: 237
Mean length: 29.637402
Min length: 1

Characters and Unicode

Total characters: 5596964
Distinct characters: 1909
Distinct categories: 13
Distinct scripts: 24
Distinct blocks: 28
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 80452
Unique (%): 42.6%

Sample

1st row: China
2nd row: Asia energy NewSilkRoad
3rd row: china asia energy
4th row: China
5th row: Mongolia

Common Values

Value | Count | Frequency (%)
BeltandRoad | 14910 | 3.0%
China | 3760 | 0.8%
beltandroad | 1538 | 0.3%
China BeltandRoad | 1397 | 0.3%
OBOR | 1201 | 0.2%
BRI | 1194 | 0.2%
KuşakveYol BeltandRoad | 1107 | 0.2%
BeltAndRoad | 1074 | 0.2%
OneBeltOneRoad | 886 | 0.2%
NewSilkRoad | 667 | 0.1%
Other values (93361) | 161114 | 32.2%
(Missing) | 311862 | 62.3%

Length

[Figure: histogram of lengths of the category]
Value | Count | Frequency (%)
beltandroad | 84501 | 14.1%
china | 53618 | 8.9%
bri | 18497 | 3.1%
obor | 16567 | 2.8%
onebeltoneroad | 11846 | 2.0%
silkroad | 6758 | 1.1%
cpec | 5759 | 1.0%
pakistan | 5214 | 0.9%
newsilkroad | 5040 | 0.8%
beltandroadinitiative | 4397 | 0.7%
Other values (50836) | 388464 | 64.7%
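
The word-level table differs from Common Values in two visible ways: case variants such as BeltandRoad, beltandroad and BeltAndRoad are merged, and multi-tag values such as "China BeltandRoad" contribute to each of their tags. A minimal sketch of that kind of count follows, assuming lowercasing and whitespace splitting. The report does not state its exact tokenisation, so this is an approximation, and `token_counts` is an illustrative name.

```python
from collections import Counter


def token_counts(values):
    """Count lowercased, whitespace-separated tokens, skipping missing entries."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        counts.update(value.lower().split())
    return counts


hashtags = [
    "BeltandRoad",
    "beltandroad",
    "China BeltandRoad",
    "BeltAndRoad",
    None,  # stands in for a missing value
]
print(token_counts(hashtags).most_common())
```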

Most occurring characters

Value | Count | Frequency (%)
a | 586167 | 10.5%
n | 421044 | 7.5%
(space) | 411813 | 7.4%
e | 396264 | 7.1%
i | 364066 | 6.5%
o | 322897 | 5.8%
t | 280890 | 5.0%
d | 276723 | 4.9%
l | 241180 | 4.3%
r | 212981 | 3.8%
Other values (1899) | 2082939 | 37.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 4084340 | 73.0%
Uppercase Letter | 1016468 | 18.2%
Space Separator | 411813 | 7.4%
Decimal Number | 44518 | 0.8%
Other Letter | 31881 | 0.6%
Connector Punctuation | 5795 | 0.1%
Nonspacing Mark | 1409 | < 0.1%
Modifier Letter | 288 | < 0.1%
Spacing Mark | 232 | < 0.1%
Other Punctuation | 165 | < 0.1%
Other values (3) | 55 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
ا 2215
 
6.9%
2022
 
6.3%
1031
 
3.2%
ر 901
 
2.8%
ل 871
 
2.7%
ي 764
 
2.4%
و 705
 
2.2%
ن 664
 
2.1%
س 611
 
1.9%
ت 594
 
1.9%
Other values (1527) 21503
67.4%
Lowercase Letter
ValueCountFrequency (%)
a 586167
14.4%
n 421044
10.3%
e 396264
9.7%
i 364066
8.9%
o 322897
7.9%
t 280890
 
6.9%
d 276723
 
6.8%
l 241180
 
5.9%
r 212981
 
5.2%
s 176505
 
4.3%
Other values (152) 805623
19.7%
Uppercase Letter
ValueCountFrequency (%)
B 159416
15.7%
R 157365
15.5%
C 121061
11.9%
O 84752
 
8.3%
I 56868
 
5.6%
A 55246
 
5.4%
S 49985
 
4.9%
T 40873
 
4.0%
P 40526
 
4.0%
E 35502
 
3.5%
Other values (88) 214874
21.1%
Nonspacing Mark
ValueCountFrequency (%)
209
14.8%
191
13.6%
171
12.1%
124
8.8%
122
8.7%
91
 
6.5%
67
 
4.8%
60
 
4.3%
53
 
3.8%
30
 
2.1%
Other values (43) 291
20.7%
Spacing Mark
ValueCountFrequency (%)
43
18.5%
27
11.6%
26
11.2%
22
9.5%
22
9.5%
ि 21
9.1%
17
 
7.3%
16
 
6.9%
5
 
2.2%
ி 5
 
2.2%
Other values (20) 28
12.1%
Decimal Number
ValueCountFrequency (%)
1 10888
24.5%
2 8727
19.6%
0 8280
18.6%
9 6352
14.3%
7 2395
 
5.4%
5 2147
 
4.8%
3 1990
 
4.5%
8 1763
 
4.0%
4 1123
 
2.5%
6 842
 
1.9%
Other values (7) 11
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
' 106
64.2%
, 30
 
18.2%
19
 
11.5%
\ 7
 
4.2%
· 3
 
1.8%
Modifier Letter
ValueCountFrequency (%)
287
99.7%
1
 
0.3%
Space Separator
ValueCountFrequency (%)
411813
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 5795
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 23
100.0%
Close Punctuation
ValueCountFrequency (%)
] 23
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 9
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 5093212 | 91.0%
Common | 462686 | 8.3%
Arabic | 12112 | 0.2%
Han | 11830 | 0.2%
Cyrillic | 7027 | 0.1%
Thai | 4913 | 0.1%
Katakana | 2522 | < 0.1%
Myanmar | 691 | < 0.1%
Hangul | 543 | < 0.1%
Greek | 486 | < 0.1%
Other values (14) | 942 | < 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
2022
 
17.1%
1031
 
8.7%
585
 
4.9%
472
 
4.0%
376
 
3.2%
276
 
2.3%
222
 
1.9%
150
 
1.3%
144
 
1.2%
143
 
1.2%
Other values (1049) 6409
54.2%
Latin
ValueCountFrequency (%)
a 586167
 
11.5%
n 421044
 
8.3%
e 396264
 
7.8%
i 364066
 
7.1%
o 322897
 
6.3%
t 280890
 
5.5%
d 276723
 
5.4%
l 241180
 
4.7%
r 212981
 
4.2%
s 176505
 
3.5%
Other values (112) 1814495
35.6%
Hangul
ValueCountFrequency (%)
40
 
7.4%
30
 
5.5%
28
 
5.2%
24
 
4.4%
21
 
3.9%
19
 
3.5%
19
 
3.5%
19
 
3.5%
18
 
3.3%
17
 
3.1%
Other values (94) 308
56.7%
Katakana
ValueCountFrequency (%)
275
 
10.9%
166
 
6.6%
165
 
6.5%
162
 
6.4%
142
 
5.6%
137
 
5.4%
123
 
4.9%
123
 
4.9%
111
 
4.4%
78
 
3.1%
Other values (65) 1040
41.2%
Cyrillic
ValueCountFrequency (%)
и 816
 
11.6%
а 646
 
9.2%
н 530
 
7.5%
е 392
 
5.6%
с 381
 
5.4%
т 364
 
5.2%
р 345
 
4.9%
ь 338
 
4.8%
к 322
 
4.6%
л 302
 
4.3%
Other values (52) 2591
36.9%
Thai
ValueCountFrequency (%)
349
 
7.1%
333
 
6.8%
277
 
5.6%
231
 
4.7%
209
 
4.3%
202
 
4.1%
193
 
3.9%
191
 
3.9%
179
 
3.6%
171
 
3.5%
Other values (50) 2578
52.5%
Arabic
ValueCountFrequency (%)
ا 2215
18.3%
ر 901
 
7.4%
ل 871
 
7.2%
ي 764
 
6.3%
و 705
 
5.8%
ن 664
 
5.5%
س 611
 
5.0%
ت 594
 
4.9%
م 480
 
4.0%
ی 393
 
3.2%
Other values (48) 3914
32.3%
Hiragana
ValueCountFrequency (%)
25
 
11.5%
10
 
4.6%
9
 
4.1%
9
 
4.1%
9
 
4.1%
8
 
3.7%
8
 
3.7%
8
 
3.7%
8
 
3.7%
8
 
3.7%
Other values (39) 116
53.2%
Greek
ValueCountFrequency (%)
α 44
 
9.1%
ν 35
 
7.2%
ο 31
 
6.4%
ι 25
 
5.1%
η 23
 
4.7%
τ 22
 
4.5%
ρ 21
 
4.3%
ε 20
 
4.1%
κ 19
 
3.9%
Ε 18
 
3.7%
Other values (38) 228
46.9%
Devanagari
ValueCountFrequency (%)
43
 
9.3%
32
 
6.9%
28
 
6.0%
27
 
5.8%
22
 
4.7%
ि 21
 
4.5%
19
 
4.1%
18
 
3.9%
18
 
3.9%
17
 
3.7%
Other values (37) 219
47.2%
Myanmar
ValueCountFrequency (%)
91
 
13.2%
60
 
8.7%
48
 
6.9%
46
 
6.7%
36
 
5.2%
28
 
4.1%
26
 
3.8%
က 26
 
3.8%
26
 
3.8%
25
 
3.6%
Other values (36) 279
40.4%
Common
ValueCountFrequency (%)
411813
89.0%
1 10888
 
2.4%
2 8727
 
1.9%
0 8280
 
1.8%
9 6352
 
1.4%
_ 5795
 
1.3%
7 2395
 
0.5%
5 2147
 
0.5%
3 1990
 
0.4%
8 1763
 
0.4%
Other values (26) 2536
 
0.5%
Sinhala
ValueCountFrequency (%)
4
 
10.3%
3
 
7.7%
3
 
7.7%
3
 
7.7%
3
 
7.7%
2
 
5.1%
2
 
5.1%
2
 
5.1%
1
 
2.6%
1
 
2.6%
Other values (15) 15
38.5%
Tamil
ValueCountFrequency (%)
5
 
8.1%
ி 5
 
8.1%
5
 
8.1%
5
 
8.1%
4
 
6.5%
4
 
6.5%
3
 
4.8%
3
 
4.8%
3
 
4.8%
3
 
4.8%
Other values (13) 22
35.5%
Kannada
ValueCountFrequency (%)
6
18.2%
4
 
12.1%
3
 
9.1%
3
 
9.1%
2
 
6.1%
1
 
3.0%
1
 
3.0%
1
 
3.0%
1
 
3.0%
1
 
3.0%
Other values (10) 10
30.3%
Hebrew
ValueCountFrequency (%)
י 12
26.1%
ס 6
13.0%
ן 6
13.0%
א 5
10.9%
ה 4
 
8.7%
ר 3
 
6.5%
פ 2
 
4.3%
ו 2
 
4.3%
ב 1
 
2.2%
ק 1
 
2.2%
Other values (4) 4
 
8.7%
Lao
ValueCountFrequency (%)
3
18.8%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
Other values (4) 4
25.0%
Bengali
ValueCountFrequency (%)
3
17.6%
2
11.8%
2
11.8%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
Other values (3) 3
17.6%
Georgian
ValueCountFrequency (%)
3
18.8%
2
12.5%
2
12.5%
2
12.5%
2
12.5%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
Ethiopic
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Oriya
ValueCountFrequency (%)
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Inherited
ValueCountFrequency (%)
َ 4
44.4%
ِ 2
22.2%
1
 
11.1%
ُ 1
 
11.1%
ٍ 1
 
11.1%
Armenian
ValueCountFrequency (%)
Ե 1
33.3%
Պ 1
33.3%
Հ 1
33.3%
Bopomofo
ValueCountFrequency (%)
3
100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 5548321 | 99.1%
Arabic | 12120 | 0.2%
CJK | 11828 | 0.2%
None | 7694 | 0.1%
Cyrillic | 7027 | 0.1%
Thai | 4913 | 0.1%
Katakana | 2822 | 0.1%
Myanmar | 691 | < 0.1%
Hangul | 543 | < 0.1%
Devanagari | 464 | < 0.1%
Other values (18) | 541 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 586167
 
10.6%
n 421044
 
7.6%
411813
 
7.4%
e 396264
 
7.1%
i 364066
 
6.6%
o 322897
 
5.8%
t 280890
 
5.1%
d 276723
 
5.0%
l 241180
 
4.3%
r 212981
 
3.8%
Other values (60) 2034296
36.7%
None
ValueCountFrequency (%)
ş 2529
32.9%
Ç 1061
13.8%
ü 739
 
9.6%
ß 360
 
4.7%
ç 279
 
3.6%
ž 252
 
3.3%
ó 235
 
3.1%
ö 203
 
2.6%
é 168
 
2.2%
ł 163
 
2.1%
Other values (113) 1705
22.2%
Arabic
ValueCountFrequency (%)
ا 2215
18.3%
ر 901
 
7.4%
ل 871
 
7.2%
ي 764
 
6.3%
و 705
 
5.8%
ن 664
 
5.5%
س 611
 
5.0%
ت 594
 
4.9%
م 480
 
4.0%
ی 393
 
3.2%
Other values (52) 3922
32.4%
CJK
ValueCountFrequency (%)
2022
 
17.1%
1031
 
8.7%
585
 
4.9%
472
 
4.0%
376
 
3.2%
276
 
2.3%
222
 
1.9%
150
 
1.3%
144
 
1.2%
143
 
1.2%
Other values (1047) 6407
54.2%
Cyrillic
ValueCountFrequency (%)
и 816
 
11.6%
а 646
 
9.2%
н 530
 
7.5%
е 392
 
5.6%
с 381
 
5.4%
т 364
 
5.2%
р 345
 
4.9%
ь 338
 
4.8%
к 322
 
4.6%
л 302
 
4.3%
Other values (52) 2591
36.9%
Thai
ValueCountFrequency (%)
349
 
7.1%
333
 
6.8%
277
 
5.6%
231
 
4.7%
209
 
4.3%
202
 
4.1%
193
 
3.9%
191
 
3.9%
179
 
3.6%
171
 
3.5%
Other values (50) 2578
52.5%
Katakana
ValueCountFrequency (%)
287
 
10.2%
275
 
9.7%
166
 
5.9%
165
 
5.8%
162
 
5.7%
142
 
5.0%
137
 
4.9%
123
 
4.4%
123
 
4.4%
111
 
3.9%
Other values (61) 1131
40.1%
Myanmar
ValueCountFrequency (%)
91
 
13.2%
60
 
8.7%
48
 
6.9%
46
 
6.7%
36
 
5.2%
28
 
4.1%
26
 
3.8%
က 26
 
3.8%
26
 
3.8%
25
 
3.6%
Other values (36) 279
40.4%
Letterlike Symbols
ValueCountFrequency (%)
45
100.0%
Devanagari
ValueCountFrequency (%)
43
 
9.3%
32
 
6.9%
28
 
6.0%
27
 
5.8%
22
 
4.7%
ि 21
 
4.5%
19
 
4.1%
18
 
3.9%
18
 
3.9%
17
 
3.7%
Other values (37) 219
47.2%
Hangul
ValueCountFrequency (%)
40
 
7.4%
30
 
5.5%
28
 
5.2%
24
 
4.4%
21
 
3.9%
19
 
3.5%
19
 
3.5%
19
 
3.5%
18
 
3.3%
17
 
3.1%
Other values (94) 308
56.7%
Hiragana
ValueCountFrequency (%)
25
 
11.5%
10
 
4.6%
9
 
4.1%
9
 
4.1%
9
 
4.1%
8
 
3.7%
8
 
3.7%
8
 
3.7%
8
 
3.7%
8
 
3.7%
Other values (39) 116
53.2%
Hebrew
ValueCountFrequency (%)
י 12
26.1%
ס 6
13.0%
ן 6
13.0%
א 5
10.9%
ה 4
 
8.7%
ר 3
 
6.5%
פ 2
 
4.3%
ו 2
 
4.3%
ב 1
 
2.2%
ק 1
 
2.2%
Other values (4) 4
 
8.7%
Kannada
ValueCountFrequency (%)
6
18.2%
4
 
12.1%
3
 
9.1%
3
 
9.1%
2
 
6.1%
1
 
3.0%
1
 
3.0%
1
 
3.0%
1
 
3.0%
1
 
3.0%
Other values (10) 10
30.3%
Tamil
ValueCountFrequency (%)
5
 
8.1%
ி 5
 
8.1%
5
 
8.1%
5
 
8.1%
4
 
6.5%
4
 
6.5%
3
 
4.8%
3
 
4.8%
3
 
4.8%
3
 
4.8%
Other values (13) 22
35.5%
Math Alphanum
ValueCountFrequency (%)
𝗶 4
21.1%
𝗻 3
15.8%
𝗫 1
 
5.3%
𝗝 1
 
5.3%
𝗽 1
 
5.3%
𝗴 1
 
5.3%
𝗣 1
 
5.3%
𝘂 1
 
5.3%
𝘁 1
 
5.3%
𝒪 1
 
5.3%
Other values (4) 4
21.1%
Sinhala
ValueCountFrequency (%)
4
 
10.3%
3
 
7.7%
3
 
7.7%
3
 
7.7%
3
 
7.7%
2
 
5.1%
2
 
5.1%
2
 
5.1%
1
 
2.6%
1
 
2.6%
Other values (15) 15
38.5%
IPA Ext
ValueCountFrequency (%)
ə 4
100.0%
Lao
ValueCountFrequency (%)
3
18.8%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
Other values (4) 4
25.0%
Bopomofo
ValueCountFrequency (%)
3
100.0%
Bengali
ValueCountFrequency (%)
3
17.6%
2
11.8%
2
11.8%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
Other values (3) 3
17.6%
Georgian
ValueCountFrequency (%)
3
18.8%
2
12.5%
2
12.5%
2
12.5%
2
12.5%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
Oriya
ValueCountFrequency (%)
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Latin Ext Additional
ValueCountFrequency (%)
1
50.0%
1
50.0%
Ethiopic
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
VS
ValueCountFrequency (%)
1
100.0%
Armenian
ValueCountFrequency (%)
Ե 1
33.3%
Պ 1
33.3%
Հ 1
33.3%
CJK Ext A
ValueCountFrequency (%)
1
100.0%

hashtag_lang
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 2344
Distinct (%): 1.2%
Missing: 311885
Missing (%): 62.3%
Memory size: 20.7 MiB
en 71 | 17489
en 50 | 6022
en 66 | 4466
en 58 | 4344
en 55 | 4285
Other values (2339) | 152219

Length

Max length: 6
Median length: 5
Mean length: 5.0013558
Min length: 4

Characters and Unicode

Total characters: 944381
Distinct characters: 36
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 746
Unique (%): 0.4%

Sample

1st row: en 50
2nd row: en 48
3rd row: en 41
4th row: en 50
5th row: en 18

Common Values

Value | Count | Frequency (%)
en 71 | 17489 | 3.5%
en 50 | 6022 | 1.2%
en 66 | 4466 | 0.9%
en 58 | 4344 | 0.9%
en 55 | 4285 | 0.9%
en 69 | 4124 | 0.8%
en 53 | 3553 | 0.7%
en 56 | 3529 | 0.7%
en 60 | 3386 | 0.7%
en 54 | 3259 | 0.7%
Other values (2334) | 134368 | 26.8%
(Missing) | 311885 | 62.3%
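
Every value in this column pairs a short language code with a number (for example "en 71"), which is why the distinct count is far higher than the number of languages involved. The sketch below splits such entries into their two parts so the language codes can be counted on their own. The report does not say what the number represents, so it is kept as an opaque integer here. The function name `split_lang_entry` is illustrative, and the sample list reuses the five Sample rows of this variable.

```python
from collections import Counter


def split_lang_entry(entry):
    """Split an entry like 'en 71' into ('en', 71).

    Returns None for missing or malformed entries.
    """
    if entry is None:
        return None
    parts = entry.split()
    if len(parts) != 2 or not parts[1].isdigit():
        return None
    return parts[0], int(parts[1])


rows = ["en 50", "en 48", "en 41", "en 50", "en 18", None]
parsed = [p for p in map(split_lang_entry, rows) if p is not None]
print(Counter(lang for lang, _ in parsed))
```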

Length

[Figure: histogram of lengths of the category]
Value | Count | Frequency (%)
en | 163936 | 43.4%
71 | 17601 | 4.7%
50 | 6221 | 1.6%
66 | 5774 | 1.5%
id | 5025 | 1.3%
55 | 4518 | 1.2%
58 | 4483 | 1.2%
69 | 4294 | 1.1%
de | 3903 | 1.0%
53 | 3824 | 1.0%
Other values (197) | 158071 | 41.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 188825 | 20.0%
e | 170314 | 18.0%
n | 164622 | 17.4%
5 | 55725 | 5.9%
6 | 54764 | 5.8%
7 | 51740 | 5.5%
4 | 40144 | 4.3%
1 | 37593 | 4.0%
3 | 37413 | 4.0%
2 | 30096 | 3.2%
Other values (26) | 113145 | 12.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 377961 | 40.0%
Decimal Number | 377595 | 40.0%
Space Separator | 188825 | 20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 170314
45.1%
n 164622
43.6%
d 9001
 
2.4%
i 7896
 
2.1%
t 4621
 
1.2%
r 3884
 
1.0%
s 3484
 
0.9%
a 2182
 
0.6%
f 1987
 
0.5%
h 1558
 
0.4%
Other values (15) 8412
 
2.2%
Decimal Number
ValueCountFrequency (%)
5 55725
14.8%
6 54764
14.5%
7 51740
13.7%
4 40144
10.6%
1 37593
10.0%
3 37413
9.9%
2 30096
8.0%
8 27985
7.4%
9 22037
 
5.8%
0 20098
 
5.3%
Space Separator
ValueCountFrequency (%)
188825
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 566420 | 60.0%
Latin | 377961 | 40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 170314
45.1%
n 164622
43.6%
d 9001
 
2.4%
i 7896
 
2.1%
t 4621
 
1.2%
r 3884
 
1.0%
s 3484
 
0.9%
a 2182
 
0.6%
f 1987
 
0.5%
h 1558
 
0.4%
Other values (15) 8412
 
2.2%
Common
ValueCountFrequency (%)
188825
33.3%
5 55725
 
9.8%
6 54764
 
9.7%
7 51740
 
9.1%
4 40144
 
7.1%
1 37593
 
6.6%
3 37413
 
6.6%
2 30096
 
5.3%
8 27985
 
4.9%
9 22037
 
3.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 944381 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
188825
20.0%
e 170314
18.0%
n 164622
17.4%
5 55725
 
5.9%
6 54764
 
5.8%
7 51740
 
5.5%
4 40144
 
4.3%
1 37593
 
4.0%
3 37413
 
4.0%
2 30096
 
3.2%
Other values (26) 113145
12.0%

hashtag_en
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 92987
Distinct (%): 49.2%
Missing: 311885
Missing (%): 62.3%
Memory size: 25.4 MiB
BeltandRoad | 14956
China | 3778
beltandroad | 1538
China BeltandRoad | 1402
OBOR | 1201
Other values (92982) | 165950

Length

Max length: 5099
Median length: 594
Mean length: 30.050322
Min length: 1

Characters and Unicode

Total characters: 5674252
Distinct characters: 890
Distinct categories: 18
Distinct scripts: 21
Distinct blocks: 26
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 80085
Unique (%): 42.4%

Sample

1st row: China
2nd row: Asia energy NewSilkRoad
3rd row: china asia energy
4th row: China
5th row: Mongolia

Common Values

Value | Count | Frequency (%)
BeltandRoad | 14956 | 3.0%
China | 3778 | 0.8%
beltandroad | 1538 | 0.3%
China BeltandRoad | 1402 | 0.3%
OBOR | 1201 | 0.2%
BRI | 1195 | 0.2%
KuşakveYol BeltandRoad | 1107 | 0.2%
BeltAndRoad | 1082 | 0.2%
OneBeltOneRoad | 917 | 0.2%
NewSilkRoad | 691 | 0.1%
Other values (92977) | 160958 | 32.1%
(Missing) | 311885 | 62.3%

Length

[Figure: histogram of lengths of the category]
Value | Count | Frequency (%)
beltandroad | 82719 | 13.2%
china | 55986 | 8.9%
bri | 18445 | 2.9%
obor | 17002 | 2.7%
onebeltoneroad | 11640 | 1.9%
silkroad | 6748 | 1.1%
cpec | 5723 | 0.9%
pakistan | 5216 | 0.8%
hongkong | 5213 | 0.8%
newsilkroad | 4983 | 0.8%
Other values (50601) | 414901 | 66.0%

Most occurring characters

Value | Count | Frequency (%)
a | 591458 | 10.4%
(space) | 439791 | 7.8%
n | 431498 | 7.6%
e | 398004 | 7.0%
i | 365768 | 6.4%
o | 332025 | 5.9%
t | 288126 | 5.1%
d | 278753 | 4.9%
l | 242522 | 4.3%
r | 213832 | 3.8%
Other values (880) | 2092475 | 36.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 4135385 | 72.9%
Uppercase Letter | 1030503 | 18.2%
Space Separator | 439791 | 7.8%
Decimal Number | 44723 | 0.8%
Other Letter | 8341 | 0.1%
Other Punctuation | 5808 | 0.1%
Connector Punctuation | 5378 | 0.1%
Dash Punctuation | 3032 | 0.1%
Nonspacing Mark | 378 | < 0.1%
Other Symbol | 321 | < 0.1%
Other values (8) | 592 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
930
 
11.1%
ا 692
 
8.3%
467
 
5.6%
429
 
5.1%
ر 351
 
4.2%
ی 337
 
4.0%
و 282
 
3.4%
ن 258
 
3.1%
ت 229
 
2.7%
ک 208
 
2.5%
Other values (547) 4158
49.9%
Lowercase Letter
ValueCountFrequency (%)
a 591458
14.3%
n 431498
10.4%
e 398004
9.6%
i 365768
8.8%
o 332025
8.0%
t 288126
 
7.0%
d 278753
 
6.7%
l 242522
 
5.9%
r 213832
 
5.2%
s 179782
 
4.3%
Other values (127) 813617
19.7%
Uppercase Letter
ValueCountFrequency (%)
B 160055
15.5%
R 159624
15.5%
C 124697
12.1%
O 83919
 
8.1%
I 58554
 
5.7%
A 56795
 
5.5%
S 50779
 
4.9%
T 42527
 
4.1%
P 40083
 
3.9%
E 36762
 
3.6%
Other values (68) 216708
21.0%
Nonspacing Mark
ValueCountFrequency (%)
91
24.1%
60
15.9%
28
 
7.4%
20
 
5.3%
18
 
4.8%
18
 
4.8%
15
 
4.0%
12
 
3.2%
10
 
2.6%
9
 
2.4%
Other values (31) 97
25.7%
Spacing Mark
ValueCountFrequency (%)
26
19.4%
22
16.4%
17
12.7%
16
11.9%
8
 
6.0%
ி 5
 
3.7%
5
 
3.7%
4
 
3.0%
4
 
3.0%
ि 3
 
2.2%
Other values (17) 24
17.9%
Other Punctuation
ValueCountFrequency (%)
, 2733
47.1%
. 1834
31.6%
' 1010
 
17.4%
\ 57
 
1.0%
? 36
 
0.6%
" 28
 
0.5%
: 24
 
0.4%
@ 23
 
0.4%
· 20
 
0.3%
! 12
 
0.2%
Other values (6) 31
 
0.5%
Decimal Number
ValueCountFrequency (%)
1 11191
25.0%
2 8658
19.4%
0 8285
18.5%
9 6314
14.1%
7 2418
 
5.4%
5 2135
 
4.8%
3 1978
 
4.4%
8 1760
 
3.9%
4 1134
 
2.5%
6 841
 
1.9%
Other values (5) 9
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
312
97.2%
8
 
2.5%
1
 
0.3%
Math Symbol
ValueCountFrequency (%)
= 90
97.8%
< 1
 
1.1%
> 1
 
1.1%
Close Punctuation
ValueCountFrequency (%)
) 55
49.1%
} 50
44.6%
] 7
 
6.2%
Open Punctuation
ValueCountFrequency (%)
{ 50
63.3%
( 24
30.4%
[ 5
 
6.3%
Space Separator
ValueCountFrequency (%)
439791
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 5378
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3032
100.0%
Modifier Letter
ValueCountFrequency (%)
158
100.0%
Final Punctuation
ValueCountFrequency (%)
8
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 6
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 3
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 5165164 | 91.0%
Common | 499566 | 8.8%
Arabic | 4001 | 0.1%
Han | 2968 | 0.1%
Myanmar | 691 | < 0.1%
Katakana | 664 | < 0.1%
Greek | 479 | < 0.1%
Thai | 212 | < 0.1%
Cyrillic | 181 | < 0.1%
Devanagari | 84 | < 0.1%
Other values (11) | 242 | < 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
930
31.3%
467
15.7%
429
14.5%
62
 
2.1%
52
 
1.8%
51
 
1.7%
51
 
1.7%
51
 
1.7%
50
 
1.7%
50
 
1.7%
Other values (298) 775
26.1%
Latin
ValueCountFrequency (%)
a 591458
 
11.5%
n 431498
 
8.4%
e 398004
 
7.7%
i 365768
 
7.1%
o 332025
 
6.4%
t 288126
 
5.6%
d 278753
 
5.4%
l 242522
 
4.7%
r 213832
 
4.1%
s 179782
 
3.5%
Other values (103) 1843396
35.7%
Common
ValueCountFrequency (%)
439791
88.0%
1 11191
 
2.2%
2 8658
 
1.7%
0 8285
 
1.7%
9 6314
 
1.3%
_ 5378
 
1.1%
- 3032
 
0.6%
, 2733
 
0.5%
7 2418
 
0.5%
5 2135
 
0.4%
Other values (50) 9631
 
1.9%
Arabic
ValueCountFrequency (%)
ا 692
17.3%
ر 351
 
8.8%
ی 337
 
8.4%
و 282
 
7.0%
ن 258
 
6.4%
ت 229
 
5.7%
ک 208
 
5.2%
م 175
 
4.4%
س 160
 
4.0%
د 158
 
3.9%
Other values (43) 1151
28.8%
Greek
ValueCountFrequency (%)
α 44
 
9.2%
ν 35
 
7.3%
ο 31
 
6.5%
ι 25
 
5.2%
η 23
 
4.8%
τ 22
 
4.6%
ρ 21
 
4.4%
ε 20
 
4.2%
κ 19
 
4.0%
ί 18
 
3.8%
Other values (38) 221
46.1%
Myanmar
ValueCountFrequency (%)
91
 
13.2%
60
 
8.7%
48
 
6.9%
46
 
6.7%
36
 
5.2%
28
 
4.1%
26
 
3.8%
က 26
 
3.8%
26
 
3.8%
25
 
3.6%
Other values (36) 279
40.4%
Thai
ValueCountFrequency (%)
14
 
6.6%
14
 
6.6%
13
 
6.1%
12
 
5.7%
10
 
4.7%
10
 
4.7%
9
 
4.2%
8
 
3.8%
8
 
3.8%
7
 
3.3%
Other values (36) 107
50.5%
Cyrillic
ValueCountFrequency (%)
а 22
 
12.2%
м 13
 
7.2%
и 12
 
6.6%
н 12
 
6.6%
ш 8
 
4.4%
р 8
 
4.4%
с 7
 
3.9%
т 7
 
3.9%
е 7
 
3.9%
ы 7
 
3.9%
Other values (29) 78
43.1%
Devanagari
ValueCountFrequency (%)
8
 
9.5%
7
 
8.3%
6
 
7.1%
6
 
7.1%
5
 
6.0%
4
 
4.8%
4
 
4.8%
3
 
3.6%
3
 
3.6%
ि 3
 
3.6%
Other values (21) 35
41.7%
Sinhala
ValueCountFrequency (%)
4
 
10.3%
3
 
7.7%
3
 
7.7%
3
 
7.7%
3
 
7.7%
2
 
5.1%
2
 
5.1%
2
 
5.1%
1
 
2.6%
1
 
2.6%
Other values (15) 15
38.5%
Katakana
ValueCountFrequency (%)
100
15.1%
51
7.7%
51
7.7%
51
7.7%
50
7.5%
50
7.5%
50
7.5%
50
7.5%
50
7.5%
50
7.5%
Other values (13) 111
16.7%
Tamil
ValueCountFrequency (%)
5
 
8.1%
ி 5
 
8.1%
5
 
8.1%
5
 
8.1%
4
 
6.5%
4
 
6.5%
3
 
4.8%
3
 
4.8%
3
 
4.8%
3
 
4.8%
Other values (13) 22
35.5%
Kannada
ValueCountFrequency (%)
6
18.2%
4
 
12.1%
3
 
9.1%
3
 
9.1%
2
 
6.1%
1
 
3.0%
ಿ 1
 
3.0%
1
 
3.0%
1
 
3.0%
1
 
3.0%
Other values (10) 10
30.3%
Hebrew
ValueCountFrequency (%)
י 12
26.1%
ס 6
13.0%
ן 6
13.0%
א 5
10.9%
ה 4
 
8.7%
ר 3
 
6.5%
ו 2
 
4.3%
פ 2
 
4.3%
ק 1
 
2.2%
ח 1
 
2.2%
Other values (4) 4
 
8.7%
Hiragana
ValueCountFrequency (%)
3
20.0%
3
20.0%
2
13.3%
2
13.3%
2
13.3%
1
 
6.7%
1
 
6.7%
1
 
6.7%
Ethiopic
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Oriya
ValueCountFrequency (%)
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Bengali
ValueCountFrequency (%)
2
22.2%
2
22.2%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
Hangul
ValueCountFrequency (%)
2
16.7%
2
16.7%
2
16.7%
2
16.7%
2
16.7%
1
8.3%
1
8.3%
Inherited
ValueCountFrequency (%)
َ 4
57.1%
ِ 2
28.6%
ٍ 1
 
14.3%
Bopomofo
ValueCountFrequency (%)
3
100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 5661889 | 99.8%
Arabic | 4008 | 0.1%
CJK | 2968 | 0.1%
None | 2773 | < 0.1%
Katakana | 816 | < 0.1%
Myanmar | 691 | < 0.1%
Misc Symbols | 312 | < 0.1%
Thai | 212 | < 0.1%
Cyrillic | 181 | < 0.1%
Devanagari | 84 | < 0.1%
Other values (16) | 318 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 591458
 
10.4%
439791
 
7.8%
n 431498
 
7.6%
e 398004
 
7.0%
i 365768
 
6.5%
o 332025
 
5.9%
t 288126
 
5.1%
d 278753
 
4.9%
l 242522
 
4.3%
r 213832
 
3.8%
Other values (80) 2080112
36.7%
None
ValueCountFrequency (%)
ş 1299
46.8%
ž 252
 
9.1%
ö 155
 
5.6%
ß 65
 
2.3%
í 50
 
1.8%
ü 50
 
1.8%
İ 45
 
1.6%
α 44
 
1.6%
é 44
 
1.6%
ν 35
 
1.3%
Other values (105) 734
26.5%
CJK
ValueCountFrequency (%)
930
31.3%
467
15.7%
429
14.5%
62
 
2.1%
52
 
1.8%
51
 
1.7%
51
 
1.7%
51
 
1.7%
50
 
1.7%
50
 
1.7%
Other values (298) 775
26.1%
Arabic
ValueCountFrequency (%)
ا 692
17.3%
ر 351
 
8.8%
ی 337
 
8.4%
و 282
 
7.0%
ن 258
 
6.4%
ت 229
 
5.7%
ک 208
 
5.2%
م 175
 
4.4%
س 160
 
4.0%
د 158
 
3.9%
Other values (46) 1158
28.9%
Misc Symbols
ValueCountFrequency (%)
312
100.0%
Katakana
ValueCountFrequency (%)
158
19.4%
100
12.3%
51
 
6.2%
51
 
6.2%
51
 
6.2%
50
 
6.1%
50
 
6.1%
50
 
6.1%
50
 
6.1%
50
 
6.1%
Other values (8) 155
19.0%
Myanmar
ValueCountFrequency (%)
91
 
13.2%
60
 
8.7%
48
 
6.9%
46
 
6.7%
36
 
5.2%
28
 
4.1%
26
 
3.8%
က 26
 
3.8%
26
 
3.8%
25
 
3.6%
Other values (36) 279
40.4%
Letterlike Symbols
ValueCountFrequency (%)
45
100.0%
Cyrillic
ValueCountFrequency (%)
а 22
 
12.2%
м 13
 
7.2%
и 12
 
6.6%
н 12
 
6.6%
ш 8
 
4.4%
р 8
 
4.4%
с 7
 
3.9%
т 7
 
3.9%
е 7
 
3.9%
ы 7
 
3.9%
Other values (29) 78
43.1%
Thai
ValueCountFrequency (%)
14
 
6.6%
14
 
6.6%
13
 
6.1%
12
 
5.7%
10
 
4.7%
10
 
4.7%
9
 
4.2%
8
 
3.8%
8
 
3.8%
7
 
3.3%
Other values (36) 107
50.5%
Hebrew
ValueCountFrequency (%)
י 12
26.1%
ס 6
13.0%
ן 6
13.0%
א 5
10.9%
ה 4
 
8.7%
ר 3
 
6.5%
ו 2
 
4.3%
פ 2
 
4.3%
ק 1
 
2.2%
ח 1
 
2.2%
Other values (4) 4
 
8.7%
Devanagari
ValueCountFrequency (%)
8
 
9.5%
7
 
8.3%
6
 
7.1%
6
 
7.1%
5
 
6.0%
4
 
4.8%
4
 
4.8%
3
 
3.6%
3
 
3.6%
ि 3
 
3.6%
Other values (21) 35
41.7%
Punctuation
ValueCountFrequency (%)
8
88.9%
1
 
11.1%
Geometric Shapes
ValueCountFrequency (%)
8
100.0%
Kannada
ValueCountFrequency (%)
6
18.2%
4
 
12.1%
3
 
9.1%
3
 
9.1%
2
 
6.1%
1
 
3.0%
ಿ 1
 
3.0%
1
 
3.0%
1
 
3.0%
1
 
3.0%
Other values (10) 10
30.3%
Tamil
ValueCountFrequency (%)
5
 
8.1%
ி 5
 
8.1%
5
 
8.1%
5
 
8.1%
4
 
6.5%
4
 
6.5%
3
 
4.8%
3
 
4.8%
3
 
4.8%
3
 
4.8%
Other values (13) 22
35.5%
Sinhala
ValueCountFrequency (%)
4
 
10.3%
3
 
7.7%
3
 
7.7%
3
 
7.7%
3
 
7.7%
2
 
5.1%
2
 
5.1%
2
 
5.1%
1
 
2.6%
1
 
2.6%
Other values (15) 15
38.5%
Math Alphanum
ValueCountFrequency (%)
𝗶 4
21.1%
𝗻 3
15.8%
𝗽 1
 
5.3%
𝒪 1
 
5.3%
𝒷 1
 
5.3%
𝑒 1
 
5.3%
𝒾 1
 
5.3%
𝒟 1
 
5.3%
𝗴 1
 
5.3%
𝗝 1
 
5.3%
Other values (4) 4
21.1%
Bopomofo
ValueCountFrequency (%)
3
100.0%
Hiragana
ValueCountFrequency (%)
3
20.0%
3
20.0%
2
13.3%
2
13.3%
2
13.3%
1
 
6.7%
1
 
6.7%
1
 
6.7%
Oriya
ValueCountFrequency (%)
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Bengali
ValueCountFrequency (%)
2
22.2%
2
22.2%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
Hangul
ValueCountFrequency (%)
2
16.7%
2
16.7%
2
16.7%
2
16.7%
2
16.7%
1
8.3%
1
8.3%
IPA Ext
ValueCountFrequency (%)
ə 1
100.0%
Ethiopic
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Block Elements
ValueCountFrequency (%)
1
100.0%

cashtag
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 415
Distinct (%): 33.7%
Missing: 499478
Missing (%): 99.8%
Memory size: 15.3 MiB
MAN | 82
HG_F | 74
v | 73
ZKIN | 42
OBOR | 41
Other values (410) | 920

Length

Max length: 142
Median length: 107
Mean length: 7.2678571
Min length: 1

Characters and Unicode

Total characters: 8954
Distinct characters: 55
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 281
Unique (%): 22.8%

Sample

1st row: AZIA
2nd row: MRVL
3rd row: MRVL
4th row: AAPL
5th row: SWX

Common Values

Value | Count | Frequency (%)
MAN | 82 | < 0.1%
HG_F | 74 | < 0.1%
v | 73 | < 0.1%
ZKIN | 42 | < 0.1%
OBOR | 41 | < 0.1%
VET | 41 | < 0.1%
btc eth ltc xrp vet | 30 | < 0.1%
YRIV | 23 | < 0.1%
FB TWTR LNKD | 20 | < 0.1%
man | 16 | < 0.1%
Other values (405) | 790 | 0.2%
(Missing) | 499478 | 99.8%

Length

[Figure: histogram of lengths of the category]
Value | Count | Frequency (%)
vet | 133 | 5.7%
man | 128 | 5.5%
v | 81 | 3.5%
hg_f | 74 | 3.2%
btc | 64 | 2.7%
obor | 54 | 2.3%
eth | 51 | 2.2%
zkin | 42 | 1.8%
xrp | 42 | 1.8%
spy | 40 | 1.7%
Other values (466) | 1625 | 69.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 1102 | 12.3%
N | 395 | 4.4%
T | 384 | 4.3%
A | 381 | 4.3%
I | 355 | 4.0%
R | 351 | 3.9%
E | 346 | 3.9%
S | 312 | 3.5%
B | 311 | 3.5%
C | 308 | 3.4%
Other values (45) | 4709 | 52.6%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 6226 | 69.5%
Lowercase Letter | 1538 | 17.2%
Space Separator | 1102 | 12.3%
Connector Punctuation | 75 | 0.8%
Other Punctuation | 13 | 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 395
 
6.3%
T 384
 
6.2%
A 381
 
6.1%
I 355
 
5.7%
R 351
 
5.6%
E 346
 
5.6%
S 312
 
5.0%
B 311
 
5.0%
C 308
 
4.9%
H 277
 
4.4%
Other values (16) 2806
45.1%
Lowercase Letter
ValueCountFrequency (%)
t 219
14.2%
v 148
 
9.6%
e 130
 
8.5%
c 111
 
7.2%
a 102
 
6.6%
l 81
 
5.3%
b 79
 
5.1%
s 69
 
4.5%
n 63
 
4.1%
r 62
 
4.0%
Other values (16) 474
30.8%
Space Separator
ValueCountFrequency (%)
1102
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 75
100.0%
Other Punctuation
ValueCountFrequency (%)
. 13
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 7764 | 86.7%
Common | 1190 | 13.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 395
 
5.1%
T 384
 
4.9%
A 381
 
4.9%
I 355
 
4.6%
R 351
 
4.5%
E 346
 
4.5%
S 312
 
4.0%
B 311
 
4.0%
C 308
 
4.0%
H 277
 
3.6%
Other values (42) 4344
56.0%
Common
ValueCountFrequency (%)
1102
92.6%
_ 75
 
6.3%
. 13
 
1.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8954 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1102
 
12.3%
N 395
 
4.4%
T 384
 
4.3%
A 381
 
4.3%
I 355
 
4.0%
R 351
 
3.9%
E 346
 
3.9%
S 312
 
3.5%
B 311
 
3.5%
C 308
 
3.4%
Other values (45) 4709
52.6%

media
Categorical

HIGH CARDINALITY  MISSING  UNIFORM 

Distinct: 104798
Distinct (%): 95.0%
Missing: 390417
Missing (%): 78.0%
Memory size: 41.7 MiB
[Photo(previewUrl='https://pbs.twimg.com/media/CSpA0IHUYAA_dzA?format=png&name=small', fullUrl='https://pbs.twimg.com/media/CSpA0IHUYAA_dzA?format=png&name=large')]
 
65
[Photo(previewUrl='https://pbs.twimg.com/media/D8UmgdpXUAA7eWz?format=jpg&name=small', fullUrl='https://pbs.twimg.com/media/D8UmgdpXUAA7eWz?format=jpg&name=large')]
 
45
[Video(thumbnailUrl='https://pbs.twimg.com/media/EJ63E2oWsAE7-je.jpg', variants=[VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1184218546018148355/vid/480x270/L-bX98i2bPwh4CiK.mp4?tag=13', bitrate=288000), VideoVariant(contentType='application/x-mpegURL', url='https://video.twimg.com/amplify_video/1184218546018148355/pl/-MDJ4O6Ig5Fnujnk.m3u8?tag=13', bitrate=None), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1184218546018148355/vid/640x360/FDcBzFJfX7jYGnM5.mp4?tag=13', bitrate=832000), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1184218546018148355/vid/1280x720/O4PltRA1G43uFMtI.mp4?tag=13', bitrate=2176000)], duration=91.34, views=26789)]
 
38
[Video(thumbnailUrl='https://pbs.twimg.com/ext_tw_video_thumb/1003609615488180226/pu/img/0NXi0mnM1YVFzjzc.jpg', variants=[VideoVariant(contentType='video/mp4', url='https://video.twimg.com/ext_tw_video/1003609615488180226/pu/vid/640x360/T9XoBgDvsYgsXsli.mp4?tag=3', bitrate=832000), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/ext_tw_video/1003609615488180226/pu/vid/320x180/ZhikSDLKuEVe9f6-.mp4?tag=3', bitrate=256000), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/ext_tw_video/1003609615488180226/pu/vid/1280x720/SIdSejWUIl3KTOaQ.mp4?tag=3', bitrate=2176000), VideoVariant(contentType='application/x-mpegURL', url='https://video.twimg.com/ext_tw_video/1003609615488180226/pu/pl/JIxe16BUWnl2CCfG.m3u8?tag=3', bitrate=None)], duration=89.333, views=95996)]
 
31
[Video(thumbnailUrl='https://pbs.twimg.com/media/EAvlOyGWsAAY4Aw.jpg', variants=[VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1156266749488246784/vid/1168x656/T_4pIO2hNM8jsCEt.mp4?tag=13', bitrate=2176000), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1156266749488246784/vid/640x360/diqhFPtuDXOepno2.mp4?tag=13', bitrate=832000), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1156266749488246784/vid/480x270/S9VWh7avTPotrazK.mp4?tag=13', bitrate=288000), VideoVariant(contentType='application/x-mpegURL', url='https://video.twimg.com/amplify_video/1156266749488246784/pl/fm-wy5c_iBqe7hXN.m3u8?tag=13', bitrate=None)], duration=21.3, views=41463)]
 
30
Other values (104793)
110084 

Length

Max length 1605
Median length 164
Mean length 226.31366
Min length 164

Characters and Unicode

Total characters 24960812
Distinct characters 77
Distinct categories 10
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
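
As a rough illustration of the character-property analysis described above, the sketch below tallies Unicode general categories for a string using Python's standard `unicodedata` module. The helper name `category_counts` and the shortened sample string are illustrative assumptions; how the report itself computes and names these categories is not shown in this document.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Ll', 'Po')."""
    return Counter(unicodedata.category(ch) for ch in text)

# A shortened string in the style of the media column (hypothetical, not a real row).
sample = "[Photo(previewUrl='https://pbs.twimg.com/media/X_1?format=jpg')]"
counts = category_counts(sample)

# 'Ll' = Lowercase Letter, 'Lu' = Uppercase Letter, 'Nd' = Decimal Number,
# 'Po' = Other Punctuation, 'Sm' = Math Symbol, 'Pc' = Connector Punctuation,
# 'Ps' / 'Pe' = Open / Close Punctuation.
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes correspond to the category names used in the tables below, for example `Sm` for the `=` sign listed under Math Symbol.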

Unique

Unique 101423
Unique (%) 92.0%

Sample

1st row [Photo(previewUrl='https://pbs.twimg.com/media/BTrHn_aCMAAR3kf?format=jpg&name=small', fullUrl='https://pbs.twimg.com/media/BTrHn_aCMAAR3kf?format=jpg&name=large')]
2nd row [Photo(previewUrl='https://pbs.twimg.com/media/BUPwyGFCAAAKWGM?format=jpg&name=small', fullUrl='https://pbs.twimg.com/media/BUPwyGFCAAAKWGM?format=jpg&name=large')]
3rd row [Photo(previewUrl='https://pbs.twimg.com/media/BUR-NnMCAAE-JGQ?format=jpg&name=small', fullUrl='https://pbs.twimg.com/media/BUR-NnMCAAE-JGQ?format=jpg&name=large')]
4th row [Photo(previewUrl='https://pbs.twimg.com/media/BUefpwHCYAAP7pN?format=jpg&name=small', fullUrl='https://pbs.twimg.com/media/BUefpwHCYAAP7pN?format=jpg&name=large')]
5th row [Photo(previewUrl='https://pbs.twimg.com/media/BUnHmU6CcAEEh8Z?format=jpg&name=small', fullUrl='https://pbs.twimg.com/media/BUnHmU6CcAEEh8Z?format=jpg&name=large')]

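The `media` values are string representations of Python objects rather than plain URLs. As a minimal sketch, assuming the only goal is to pull out the full-size photo URL, a regular expression over the `fullUrl='...'` field is enough. The function name `photo_full_urls` is hypothetical, and video entries (which carry `variants=[...]` instead of `fullUrl`) are deliberately left unhandled here.

```python
import re
from typing import List

# Matches the fullUrl='...' field of each Photo(...) entry.
FULL_URL = re.compile(r"fullUrl='([^']+)'")

def photo_full_urls(media_repr: str) -> List[str]:
    """Return every fullUrl found in a media string; an empty list if none."""
    return FULL_URL.findall(media_repr)

# First sample row from the report.
row = (
    "[Photo(previewUrl='https://pbs.twimg.com/media/BTrHn_aCMAAR3kf?format=jpg&name=small', "
    "fullUrl='https://pbs.twimg.com/media/BTrHn_aCMAAR3kf?format=jpg&name=large')]"
)
print(photo_full_urls(row))
```

A regular expression is used instead of evaluating the string because `Photo` and `Video` are not defined in a plain Python session.
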
Common Values

ValueCountFrequency (%)
[Photo(previewUrl='https://pbs.twimg.com/media/CSpA0IHUYAA_dzA?format=png&name=small', fullUrl='https://pbs.twimg.com/media/CSpA0IHUYAA_dzA?format=png&name=large')] 65
 
< 0.1%
[Photo(previewUrl='https://pbs.twimg.com/media/D8UmgdpXUAA7eWz?format=jpg&name=small', fullUrl='https://pbs.twimg.com/media/D8UmgdpXUAA7eWz?format=jpg&name=large')] 45
 
< 0.1%
[Video(thumbnailUrl='https://pbs.twimg.com/media/EJ63E2oWsAE7-je.jpg', variants=[VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1184218546018148355/vid/480x270/L-bX98i2bPwh4CiK.mp4?tag=13', bitrate=288000), VideoVariant(contentType='application/x-mpegURL', url='https://video.twimg.com/amplify_video/1184218546018148355/pl/-MDJ4O6Ig5Fnujnk.m3u8?tag=13', bitrate=None), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1184218546018148355/vid/640x360/FDcBzFJfX7jYGnM5.mp4?tag=13', bitrate=832000), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1184218546018148355/vid/1280x720/O4PltRA1G43uFMtI.mp4?tag=13', bitrate=2176000)], duration=91.34, views=26789)] 38
 
< 0.1%
[Video(thumbnailUrl='https://pbs.twimg.com/ext_tw_video_thumb/1003609615488180226/pu/img/0NXi0mnM1YVFzjzc.jpg', variants=[VideoVariant(contentType='video/mp4', url='https://video.twimg.com/ext_tw_video/1003609615488180226/pu/vid/640x360/T9XoBgDvsYgsXsli.mp4?tag=3', bitrate=832000), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/ext_tw_video/1003609615488180226/pu/vid/320x180/ZhikSDLKuEVe9f6-.mp4?tag=3', bitrate=256000), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/ext_tw_video/1003609615488180226/pu/vid/1280x720/SIdSejWUIl3KTOaQ.mp4?tag=3', bitrate=2176000), VideoVariant(contentType='application/x-mpegURL', url='https://video.twimg.com/ext_tw_video/1003609615488180226/pu/pl/JIxe16BUWnl2CCfG.m3u8?tag=3', bitrate=None)], duration=89.333, views=95996)] 31
 
< 0.1%
[Video(thumbnailUrl='https://pbs.twimg.com/media/EAvlOyGWsAAY4Aw.jpg', variants=[VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1156266749488246784/vid/1168x656/T_4pIO2hNM8jsCEt.mp4?tag=13', bitrate=2176000), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1156266749488246784/vid/640x360/diqhFPtuDXOepno2.mp4?tag=13', bitrate=832000), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1156266749488246784/vid/480x270/S9VWh7avTPotrazK.mp4?tag=13', bitrate=288000), VideoVariant(contentType='application/x-mpegURL', url='https://video.twimg.com/amplify_video/1156266749488246784/pl/fm-wy5c_iBqe7hXN.m3u8?tag=13', bitrate=None)], duration=21.3, views=41463)] 30
 
< 0.1%
[Photo(previewUrl='https://pbs.twimg.com/media/Clhyx46UgAQ4kxq?format=jpg&name=small', fullUrl='https://pbs.twimg.com/media/Clhyx46UgAQ4kxq?format=jpg&name=large')] 26
 
< 0.1%
[Photo(previewUrl='https://pbs.twimg.com/media/DJF4bYKUMAAaFHc?format=jpg&name=small', fullUrl='https://pbs.twimg.com/media/DJF4bYKUMAAaFHc?format=jpg&name=large')] 24
 
< 0.1%
[Video(thumbnailUrl='https://pbs.twimg.com/media/D8Z9SQ2XkAACelN.jpg', variants=[VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1136733439720337408/vid/640x360/oXlcm9mPn3ILwVpF.mp4?tag=13', bitrate=832000), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1136733439720337408/vid/480x270/-CXyBmUcw1_MND8W.mp4?tag=13', bitrate=288000), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1136733439720337408/vid/1280x720/yTSHxuDrasbSM92H.mp4?tag=13', bitrate=2176000), VideoVariant(contentType='application/x-mpegURL', url='https://video.twimg.com/amplify_video/1136733439720337408/pl/iPJLyN2uhNG1THwx.m3u8?tag=13', bitrate=None)], duration=92.14, views=24300)] 24
 
< 0.1%
[Video(thumbnailUrl='https://pbs.twimg.com/media/D40bercUYAA9-d-.jpg', variants=[VideoVariant(contentType='application/x-mpegURL', url='https://video.twimg.com/amplify_video/1120581640512606208/pl/xhGe2nJ6_-0cTojL.m3u8?tag=11', bitrate=None), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1120581640512606208/vid/320x180/N8SF_arpZmDVwPdA.mp4?tag=11', bitrate=288000), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/amplify_video/1120581640512606208/vid/640x360/84-tXOQAcXJVsG0Q.mp4?tag=11', bitrate=832000)], duration=389.12, views=22715)] 21
 
< 0.1%
[Photo(previewUrl='https://pbs.twimg.com/media/B5ANn_iIgAAdMIw?format=jpg&name=small', fullUrl='https://pbs.twimg.com/media/B5ANn_iIgAAdMIw?format=jpg&name=large')] 20
 
< 0.1%
Other values (104788) 109969
 
22.0%
(Missing) 390417
78.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
videovariant(contenttype='video/mp4 11374
 
3.4%
bitrate=none 5641
 
1.7%
bitrate=832000 5300
 
1.6%
variants=[videovariant(contenttype='video/mp4 4995
 
1.5%
videovariant(contenttype='application/x-mpegurl 4137
 
1.2%
bitrate=2176000 4042
 
1.2%
bitrate=256000 2960
 
0.9%
variants=[videovariant(contenttype='application/x-mpegurl 1504
 
0.4%
bitrate=288000 1419
 
0.4%
bitrate=0 858
 
0.3%
Other values (269781) 292601
87.4%

Most occurring characters

ValueCountFrequency (%)
m 1555076
 
6.2%
t 1452814
 
5.8%
/ 1223179
 
4.9%
a 1179841
 
4.7%
e 1120571
 
4.5%
p 1068385
 
4.3%
l 980944
 
3.9%
o 975644
 
3.9%
i 907766
 
3.6%
r 883673
 
3.5%
Other values (67) 13612919
54.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15629193
62.6%
Other Punctuation 3443147
 
13.8%
Uppercase Letter 2758705
 
11.1%
Decimal Number 1336692
 
5.4%
Math Symbol 860945
 
3.4%
Close Punctuation 270811
 
1.1%
Open Punctuation 270811
 
1.1%
Space Separator 224538
 
0.9%
Connector Punctuation 113291
 
0.5%
Dash Punctuation 52679
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m 1555076
 
9.9%
t 1452814
 
9.3%
a 1179841
 
7.5%
e 1120571
 
7.2%
p 1068385
 
6.8%
l 980944
 
6.3%
o 975644
 
6.2%
i 907766
 
5.8%
r 883673
 
5.7%
g 746556
 
4.8%
Other values (16) 4757923
30.4%
Uppercase Letter
ValueCountFrequency (%)
A 465035
16.9%
U 403327
14.6%
E 183461
 
6.7%
D 175417
 
6.4%
P 169761
 
6.2%
V 137472
 
5.0%
X 117600
 
4.3%
W 111486
 
4.0%
C 98474
 
3.6%
I 76173
 
2.8%
Other values (16) 820499
29.7%
Decimal Number
ValueCountFrequency (%)
0 205364
15.4%
4 169476
12.7%
1 149097
11.2%
2 138252
10.3%
8 131725
9.9%
3 122459
9.2%
6 118462
8.9%
5 105880
7.9%
7 101750
7.6%
9 94227
7.0%
Other Punctuation
ValueCountFrequency (%)
/ 1223179
35.5%
' 603078
17.5%
. 593208
17.2%
: 279529
 
8.1%
? 268126
 
7.8%
& 251489
 
7.3%
, 224538
 
6.5%
Close Punctuation
ValueCountFrequency (%)
) 154019
56.9%
] 116792
43.1%
Open Punctuation
ValueCountFrequency (%)
( 154019
56.9%
[ 116792
43.1%
Math Symbol
ValueCountFrequency (%)
= 860945
100.0%
Space Separator
ValueCountFrequency (%)
(space) 224538
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 113291
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 52679
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18387898
73.7%
Common 6572914
 
26.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
m 1555076
 
8.5%
t 1452814
 
7.9%
a 1179841
 
6.4%
e 1120571
 
6.1%
p 1068385
 
5.8%
l 980944
 
5.3%
o 975644
 
5.3%
i 907766
 
4.9%
r 883673
 
4.8%
g 746556
 
4.1%
Other values (42) 7516628
40.9%
Common
ValueCountFrequency (%)
/ 1223179
18.6%
= 860945
13.1%
' 603078
 
9.2%
. 593208
 
9.0%
: 279529
 
4.3%
? 268126
 
4.1%
& 251489
 
3.8%
, 224538
 
3.4%
(space) 224538
 
3.4%
0 205364
 
3.1%
Other values (15) 1838920
28.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24960812
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
m 1555076
 
6.2%
t 1452814
 
5.8%
/ 1223179
 
4.9%
a 1179841
 
4.7%
e 1120571
 
4.5%
p 1068385
 
4.3%
l 980944
 
3.9%
o 975644
 
3.9%
i 907766
 
3.6%
r 883673
 
3.5%
Other values (67) 13612919
54.5%

image_url
URL

Distinct 99196
Distinct (%) 95.6%
Missing 396933
Missing (%) 79.3%
Memory size 25.6 MiB
https://pbs.twimg.com/media/CSpA0IHUYAA_dzA?format=png&name=large
 
65
https://pbs.twimg.com/media/D8UmgdpXUAA7eWz?format=jpg&name=large
 
45
https://pbs.twimg.com/media/Clhyx46UgAQ4kxq?format=jpg&name=large
 
26
https://pbs.twimg.com/media/DJF4bYKUMAAaFHc?format=jpg&name=large
 
24
https://pbs.twimg.com/media/B5ANn_iIgAAdMIw?format=jpg&name=large
 
20
Other values (99191)
103597 
(Missing)
396933 
ValueCountFrequency (%)
https://pbs.twimg.com/media/CSpA0IHUYAA_dzA?format=png&name=large 65
 
< 0.1%
https://pbs.twimg.com/media/D8UmgdpXUAA7eWz?format=jpg&name=large 45
 
< 0.1%
https://pbs.twimg.com/media/Clhyx46UgAQ4kxq?format=jpg&name=large 26
 
< 0.1%
https://pbs.twimg.com/media/DJF4bYKUMAAaFHc?format=jpg&name=large 24
 
< 0.1%
https://pbs.twimg.com/media/B5ANn_iIgAAdMIw?format=jpg&name=large 20
 
< 0.1%
https://pbs.twimg.com/media/DmGZINgUYAUeixn?format=jpg&name=large 19
 
< 0.1%
https://pbs.twimg.com/media/DOlbpuUX0AEQmYk?format=jpg&name=large 15
 
< 0.1%
https://pbs.twimg.com/media/DG9OtPrWsAAAlBk?format=jpg&name=large 13
 
< 0.1%
https://pbs.twimg.com/media/DECeGFwW0AAr_XT?format=jpg&name=large 13
 
< 0.1%
https://pbs.twimg.com/media/DF4mJalWAAQOTjw?format=jpg&name=large 13
 
< 0.1%
Other values (99186) 103524
 
20.7%
(Missing) 396933
79.3%
ValueCountFrequency (%)
https 103777
 
20.7%
(Missing) 396933
79.3%
ValueCountFrequency (%)
pbs.twimg.com 103777
 
20.7%
(Missing) 396933
79.3%
ValueCountFrequency (%)
/media/CSpA0IHUYAA_dzA 65
 
< 0.1%
/media/D8UmgdpXUAA7eWz 45
 
< 0.1%
/media/Clhyx46UgAQ4kxq 26
 
< 0.1%
/media/DJF4bYKUMAAaFHc 24
 
< 0.1%
/media/B5ANn_iIgAAdMIw 20
 
< 0.1%
/media/DmGZINgUYAUeixn 19
 
< 0.1%
/media/DOlbpuUX0AEQmYk 15
 
< 0.1%
/media/DECeGFwW0AAr_XT 13
 
< 0.1%
/media/DFkpUm-VwAAy5hx 13
 
< 0.1%
/media/DDPgqODXoAE9whg 13
 
< 0.1%
Other values (99185) 103524
 
20.7%
(Missing) 396933
79.3%
ValueCountFrequency (%)
format=jpg&name=large 87779
 
17.5%
format=png&name=large 4912
 
1.0%
format=jpg&name=large https://pbs.twimg.com/media/C_xoUKVVwAAcDr7?format=jpg&name=large https://pbs.twimg.com/media/C_xoUKXUMAAw5l3?format=jpg&name=large 10
 
< 0.1%
format=jpg&name=large https://pbs.twimg.com/media/DLRTxEuUEAAYV3p?format=jpg&name=large https://pbs.twimg.com/media/DLRTxalUIAAUBZ-?format=jpg&name=large 9
 
< 0.1%
format=jpg&name=large https://pbs.twimg.com/media/DLRbaM_UEAAjX1a?format=jpg&name=large 9
 
< 0.1%
format=jpg&name=large https://pbs.twimg.com/media/C_ytNgqXkAAdVDO?format=jpg&name=large 9
 
< 0.1%
format=jpg&name=large https://pbs.twimg.com/media/Bu7tHVtIUAE4u55?format=jpg&name=large 8
 
< 0.1%
format=jpg&name=large https://pbs.twimg.com/media/Da-0hj5VMAA0ztg?format=jpg&name=large https://pbs.twimg.com/media/Da-0jlHVAAAi9Ie?format=jpg&name=large https://pbs.twimg.com/media/Da-0ltsU8AIZuco?format=jpg&name=large 7
 
< 0.1%
format=jpg&name=large https://pbs.twimg.com/media/Cwj0DaRXEAA71Um?format=jpg&name=large 6
 
< 0.1%
format=jpg&name=large https://pbs.twimg.com/media/D416neXWAAAW6bo?format=jpg&name=large https://pbs.twimg.com/media/D416neVWwAA1yOK?format=jpg&name=large https://pbs.twimg.com/media/D416ngMW4AEV1Gc?format=jpg&name=large 6
 
< 0.1%
Other values (10661) 11022
 
2.2%
(Missing) 396933
79.3%
ValueCountFrequency (%)
103777
 
20.7%
(Missing) 396933
79.3%
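
The frequency tables above break each URL into components: `https`, `pbs.twimg.com`, a `/media/...` path, and a `format=...&name=...` query string. As a small sketch of how such a decomposition can be obtained, the example below uses `urllib.parse.urlsplit` from the Python standard library on a URL taken from this report. The second example uses a made-up first URL (`EXAMPLE` is a placeholder) to show one plausible reason, offered here as an assumption, why some query values in the table contain entire additional URLs: a cell that holds several space-separated URLs is split as if it were a single one.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://pbs.twimg.com/media/CSpA0IHUYAA_dzA?format=png&name=large"
parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.netloc)    # pbs.twimg.com
print(parts.path)      # /media/CSpA0IHUYAA_dzA
print(parts.query)     # format=png&name=large

# parse_qs turns the query string into a dict mapping each key to a list of values.
params = parse_qs(parts.query)

# Two URLs joined by a space (the first one is a placeholder, not real data):
# everything after the first '?' ends up in the query component.
joined = (
    "https://pbs.twimg.com/media/EXAMPLE?format=jpg&name=large "
    "https://pbs.twimg.com/media/DLRbaM_UEAAjX1a?format=jpg&name=large"
)
joined_query = urlsplit(joined).query
print(joined_query)
```

Splitting such cells on whitespace before parsing would give one clean set of components per URL.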

video_url
URL

Distinct 4744
Distinct (%) 84.2%
Missing 495073
Missing (%) 98.9%
Memory size 15.9 MiB
https://video.twimg.com/amplify_video/1184218546018148355/vid/1280x720/O4PltRA1G43uFMtI.mp4?tag=13
 
38
https://video.twimg.com/ext_tw_video/1003609615488180226/pu/vid/1280x720/SIdSejWUIl3KTOaQ.mp4?tag=3
 
31
https://video.twimg.com/amplify_video/1156266749488246784/vid/1168x656/T_4pIO2hNM8jsCEt.mp4?tag=13
 
30
https://video.twimg.com/amplify_video/1136733439720337408/vid/1280x720/yTSHxuDrasbSM92H.mp4?tag=13
 
24
https://video.twimg.com/amplify_video/1120581640512606208/vid/640x360/84-tXOQAcXJVsG0Q.mp4?tag=11
 
21
Other values (4739)
 
5493
(Missing)
495073 
ValueCountFrequency (%)
https://video.twimg.com/amplify_video/1184218546018148355/vid/1280x720/O4PltRA1G43uFMtI.mp4?tag=13 38
 
< 0.1%
https://video.twimg.com/ext_tw_video/1003609615488180226/pu/vid/1280x720/SIdSejWUIl3KTOaQ.mp4?tag=3 31
 
< 0.1%
https://video.twimg.com/amplify_video/1156266749488246784/vid/1168x656/T_4pIO2hNM8jsCEt.mp4?tag=13 30
 
< 0.1%
https://video.twimg.com/amplify_video/1136733439720337408/vid/1280x720/yTSHxuDrasbSM92H.mp4?tag=13 24
 
< 0.1%
https://video.twimg.com/amplify_video/1120581640512606208/vid/640x360/84-tXOQAcXJVsG0Q.mp4?tag=11 21
 
< 0.1%
https://video.twimg.com/amplify_video/1009755756718100480/vid/1280x720/h_qCyWfNIn9lkKRg.mp4?tag=2 14
 
< 0.1%
https://video.twimg.com/ext_tw_video/826411685716119553/pu/vid/720x720/CY6ZuhWHvvH5-LF7.mp4 14
 
< 0.1%
https://video.twimg.com/amplify_video/1103613439702908933/vid/1280x720/jS1WuEbm7noMgzDQ.mp4?tag=9 11
 
< 0.1%
https://video.twimg.com/amplify_video/1121616545199742976/vid/1280x720/jee5ElylvJ4z8znL.mp4?tag=11 11
 
< 0.1%
https://video.twimg.com/ext_tw_video/981500388368158721/pu/vid/720x720/Db4h3kCuBZpGw5me.mp4?tag=2 10
 
< 0.1%
Other values (4734) 5433
 
1.1%
(Missing) 495073
98.9%
ValueCountFrequency (%)
https 5637
 
1.1%
(Missing) 495073
98.9%
ValueCountFrequency (%)
video.twimg.com 5637
 
1.1%
(Missing) 495073
98.9%
ValueCountFrequency (%)
/amplify_video/1184218546018148355/vid/1280x720/O4PltRA1G43uFMtI.mp4 38
 
< 0.1%
/ext_tw_video/1003609615488180226/pu/vid/1280x720/SIdSejWUIl3KTOaQ.mp4 31
 
< 0.1%
/amplify_video/1156266749488246784/vid/1168x656/T_4pIO2hNM8jsCEt.mp4 30
 
< 0.1%
/amplify_video/1136733439720337408/vid/1280x720/yTSHxuDrasbSM92H.mp4 24
 
< 0.1%
/amplify_video/1120581640512606208/vid/640x360/84-tXOQAcXJVsG0Q.mp4 21
 
< 0.1%
/ext_tw_video/826411685716119553/pu/vid/720x720/CY6ZuhWHvvH5-LF7.mp4 14
 
< 0.1%
/amplify_video/1009755756718100480/vid/1280x720/h_qCyWfNIn9lkKRg.mp4 14
 
< 0.1%
/amplify_video/1103613439702908933/vid/1280x720/jS1WuEbm7noMgzDQ.mp4 11
 
< 0.1%
/amplify_video/1121616545199742976/vid/1280x720/jee5ElylvJ4z8znL.mp4 11
 
< 0.1%
/ext_tw_video/981500388368158721/pu/vid/720x720/Db4h3kCuBZpGw5me.mp4 10
 
< 0.1%
Other values (4734) 5433
 
1.1%
(Missing) 495073
98.9%
ValueCountFrequency (%)
1120
 
0.2%
tag=10 1024
 
0.2%
tag=12 734
 
0.1%
tag=8 665
 
0.1%
tag=13 507
 
0.1%
tag=11 396
 
0.1%
tag=5 264
 
0.1%
tag=9 254
 
0.1%
tag=14 201
 
< 0.1%
tag=3 192
 
< 0.1%
Other values (4) 280
 
0.1%
(Missing) 495073
98.9%
ValueCountFrequency (%)
5637
 
1.1%
(Missing) 495073
98.9%

GIF_url
URL

Distinct 837
Distinct (%) 97.6%
Missing 499852
Missing (%) 99.8%
Memory size 15.3 MiB
https://video.twimg.com/tweet_video/D--tSrjX4AEGStU.mp4
 
10
https://video.twimg.com/tweet_video/DqRPm_FXcAAXX-p.mp4
 
3
https://video.twimg.com/tweet_video/Dc1ma9VW4AAp1SY.mp4
 
3
https://video.twimg.com/tweet_video/CsVQWuDWYAAZN9_.mp4
 
3
https://video.twimg.com/tweet_video/DNmyBgjVoAAowjv.mp4
 
2
Other values (832)
 
837
(Missing)
499852 
ValueCountFrequency (%)
https://video.twimg.com/tweet_video/D--tSrjX4AEGStU.mp4 10
 
< 0.1%
https://video.twimg.com/tweet_video/DqRPm_FXcAAXX-p.mp4 3
 
< 0.1%
https://video.twimg.com/tweet_video/Dc1ma9VW4AAp1SY.mp4 3
 
< 0.1%
https://video.twimg.com/tweet_video/CsVQWuDWYAAZN9_.mp4 3
 
< 0.1%
https://video.twimg.com/tweet_video/DNmyBgjVoAAowjv.mp4 2
 
< 0.1%
https://video.twimg.com/tweet_video/DQsnr-dXkAAoz_b.mp4 2
 
< 0.1%
https://video.twimg.com/tweet_video/C_YGhKPXUAEvDmh.mp4 2
 
< 0.1%
https://video.twimg.com/tweet_video/Dkd5VGAX0AIsqIQ.mp4 2
 
< 0.1%
https://video.twimg.com/tweet_video/D40scJvUYAIg6CR.mp4 2
 
< 0.1%
https://video.twimg.com/tweet_video/Cite_WPUkAEXgal.mp4 2
 
< 0.1%
Other values (827) 827
 
0.2%
(Missing) 499852
99.8%
ValueCountFrequency (%)
https 858
 
0.2%
(Missing) 499852
99.8%
ValueCountFrequency (%)
video.twimg.com 858
 
0.2%
(Missing) 499852
99.8%
ValueCountFrequency (%)
/tweet_video/D--tSrjX4AEGStU.mp4 10
 
< 0.1%
/tweet_video/DqRPm_FXcAAXX-p.mp4 3
 
< 0.1%
/tweet_video/Dc1ma9VW4AAp1SY.mp4 3
 
< 0.1%
/tweet_video/CsVQWuDWYAAZN9_.mp4 3
 
< 0.1%
/tweet_video/DNmyBgjVoAAowjv.mp4 2
 
< 0.1%
/tweet_video/DQsnr-dXkAAoz_b.mp4 2
 
< 0.1%
/tweet_video/C_YGhKPXUAEvDmh.mp4 2
 
< 0.1%
/tweet_video/Dkd5VGAX0AIsqIQ.mp4 2
 
< 0.1%
/tweet_video/D40scJvUYAIg6CR.mp4 2
 
< 0.1%
/tweet_video/Cite_WPUkAEXgal.mp4 2
 
< 0.1%
Other values (827) 827
 
0.2%
(Missing) 499852
99.8%
ValueCountFrequency (%)
858
 
0.2%
(Missing) 499852
99.8%
ValueCountFrequency (%)
858
 
0.2%
(Missing) 499852
99.8%

likes
Real number (ℝ)

SKEWED  ZEROS 

Distinct 921
Distinct (%) 0.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 4.9468355
Minimum 0
Maximum 23288
Zeros 314331
Zeros (%) 62.8%
Negative 0
Negative (%) 0.0%
Memory size 3.8 MiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 0
Q3 1
95-th percentile 13
Maximum 23288
Range 23288
Interquartile range (IQR) 1

Descriptive statistics

Standard deviation 71.188337
Coefficient of variation (CV) 14.390682
Kurtosis 38111.048
Mean 4.9468355
Median Absolute Deviation (MAD) 0
Skewness 143.3288
Sum 2476930
Variance 5067.7794
Monotonicity Not monotonic
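
Several of the figures above are related by simple identities, which makes them easy to sanity-check. The sketch below copies the reported `likes` values and confirms that the coefficient of variation is the standard deviation divided by the mean, that the variance is the squared standard deviation, and that the sum equals the mean times the number of observations (500710, from the overview; `likes` has no missing values). The variable names are chosen here for illustration.

```python
import math

# Values copied from the likes summary in this report.
n = 500710            # number of observations (overview section)
mean = 4.9468355
std = 71.188337
variance = 5067.7794
cv = 14.390682
total = 2476930

# Coefficient of variation = standard deviation / mean.
cv_ok = math.isclose(std / mean, cv, rel_tol=1e-6)
# Variance = standard deviation squared.
var_ok = math.isclose(std ** 2, variance, rel_tol=1e-6)
# Sum = mean * number of non-missing observations.
sum_ok = math.isclose(mean * n, total, rel_tol=1e-6)

print(cv_ok, var_ok, sum_ok)
```

A CV above 14 together with a median of 0 reflects what the SKEWED and ZEROS flags indicate: most tweets have no likes, while a handful have thousands.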
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 314331
62.8%
1 71118
 
14.2%
2 30286
 
6.0%
3 17210
 
3.4%
4 11075
 
2.2%
5 7884
 
1.6%
6 5999
 
1.2%
7 4641
 
0.9%
8 3802
 
0.8%
9 2926
 
0.6%
Other values (911) 31438
 
6.3%

Minimum 10 values

ValueCountFrequency (%)
0 314331
62.8%
1 71118
 
14.2%
2 30286
 
6.0%
3 17210
 
3.4%
4 11075
 
2.2%
5 7884
 
1.6%
6 5999
 
1.2%
7 4641
 
0.9%
8 3802
 
0.8%
9 2926
 
0.6%

Maximum 10 values

ValueCountFrequency (%)
23288 1
< 0.1%
20490 1
< 0.1%
7020 1
< 0.1%
7012 1
< 0.1%
5868 1
< 0.1%
5631 1
< 0.1%
5456 1
< 0.1%
5284 1
< 0.1%
5233 1
< 0.1%
5214 1
< 0.1%

retweets
Real number (ℝ)

SKEWED  ZEROS 

Distinct 450
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 1.9416009
Minimum 0
Maximum 7505
Zeros 359328
Zeros (%) 71.8%
Negative 0
Negative (%) 0.0%
Memory size 3.8 MiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 0
Q3 1
95-th percentile 7
Maximum 7505
Range 7505
Interquartile range (IQR) 1

Descriptive statistics

Standard deviation 23.135042
Coefficient of variation (CV) 11.915446
Kurtosis 45718.138
Mean 1.9416009
Median Absolute Deviation (MAD) 0
Skewness 167.99774
Sum 972179
Variance 535.23016
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 359328
71.8%
1 56898
 
11.4%
2 24696
 
4.9%
3 13963
 
2.8%
4 8765
 
1.8%
5 6157
 
1.2%
6 4627
 
0.9%
7 3425
 
0.7%
8 2688
 
0.5%
9 2198
 
0.4%
Other values (440) 17965
 
3.6%

Minimum 10 values

ValueCountFrequency (%)
0 359328
71.8%
1 56898
 
11.4%
2 24696
 
4.9%
3 13963
 
2.8%
4 8765
 
1.8%
5 6157
 
1.2%
6 4627
 
0.9%
7 3425
 
0.7%
8 2688
 
0.5%
9 2198
 
0.4%

Maximum 10 values

ValueCountFrequency (%)
7505 1
< 0.1%
6871 1
< 0.1%
5576 1
< 0.1%
2866 1
< 0.1%
2457 1
< 0.1%
2015 1
< 0.1%
1783 1
< 0.1%
1747 1
< 0.1%
1461 1
< 0.1%
1453 1
< 0.1%

replies
Real number (ℝ)

SKEWED  ZEROS 

Distinct 175
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 0.39902738
Minimum 0
Maximum 4975
Zeros 428227
Zeros (%) 85.5%
Negative 0
Negative (%) 0.0%
Memory size 3.8 MiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 0
Q3 0
95-th percentile 1
Maximum 4975
Range 4975
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 8.7456224
Coefficient of variation (CV) 21.917349
Kurtosis 215275.06
Mean 0.39902738
Median Absolute Deviation (MAD) 0
Skewness 403.27883
Sum 199797
Variance 76.485912
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 428227
85.5%
1 49537
 
9.9%
2 10234
 
2.0%
3 4052
 
0.8%
4 2182
 
0.4%
5 1432
 
0.3%
6 968
 
0.2%
7 691
 
0.1%
8 490
 
0.1%
9 351
 
0.1%
Other values (165) 2546
 
0.5%

Minimum 10 values

ValueCountFrequency (%)
0 428227
85.5%
1 49537
 
9.9%
2 10234
 
2.0%
3 4052
 
0.8%
4 2182
 
0.4%
5 1432
 
0.3%
6 968
 
0.2%
7 691
 
0.1%
8 490
 
0.1%
9 351
 
0.1%

Maximum 10 values

ValueCountFrequency (%)
4975 1
< 0.1%
1966 1
< 0.1%
1077 1
< 0.1%
891 1
< 0.1%
783 1
< 0.1%
731 1
< 0.1%
638 1
< 0.1%
586 1
< 0.1%
563 1
< 0.1%
459 1
< 0.1%

reply_to_user
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing 433060
Missing (%) 86.5%
Memory size 15.3 MiB

mentioned_users
Categorical

HIGH CARDINALITY  MISSING 

Distinct 62698
Distinct (%) 41.0%
Missing 347742
Missing (%) 69.4%
Memory size 21.8 MiB
10228272
 
6343
23922797
 
2117
995000000000000000
 
1454
39922594
 
1427
487118986
 
1279
Other values (62693)
140348 

Length

Max length 697
Median length 691
Mean length 19.968392
Min length 2

Characters and Unicode

Total characters 3054525
Distinct characters 11
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 52139
Unique (%) 34.1%

Sample

1st row 83521919
2nd row 1050000000000000000
3rd row 1151681138
4th row 1050000000000000000
5th row 636689752 25159286 28109695 93540572 55725588 1348980000000000000 21425858 477017537

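As the fifth sample row shows, a single `mentioned_users` cell can hold several space-separated user IDs, so counting whole cells and counting individual IDs give different results. The sketch below, which assumes whitespace is the only separator, splits the five sample rows and counts each ID on its own; the variable names are illustrative.

```python
from collections import Counter

# Sample rows from the report; the fifth row holds several space-separated IDs.
rows = [
    "83521919",
    "1050000000000000000",
    "1151681138",
    "1050000000000000000",
    "636689752 25159286 28109695 93540572 55725588 1348980000000000000 21425858 477017537",
]

# Split each cell on whitespace and count every individual ID.
mention_counts = Counter(user_id for cell in rows for user_id in cell.split())

print(mention_counts.most_common(1))
print(len(mention_counts), sum(mention_counts.values()))
```

This per-ID view corresponds to the word-level frequency table further down, where `10228272` has a higher count than in the whole-cell Common Values table.
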
Common Values

ValueCountFrequency (%)
10228272 6343
 
1.3%
23922797 2117
 
0.4%
995000000000000000 1454
 
0.3%
39922594 1427
 
0.3%
487118986 1279
 
0.3%
4898091 1211
 
0.2%
18949452 1149
 
0.2%
1652541 945
 
0.2%
5120691 945
 
0.2%
91478624 909
 
0.2%
Other values (62688) 135189
 
27.0%
(Missing) 347742
69.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
10228272 6670
 
2.4%
23922797 2389
 
0.9%
141627220 2023
 
0.7%
487118986 1952
 
0.7%
228535666 1932
 
0.7%
995000000000000000 1874
 
0.7%
39922594 1576
 
0.6%
18949452 1489
 
0.5%
4898091 1395
 
0.5%
852000000000000000 1343
 
0.5%
Other values (60267) 250987
91.7%

Most occurring characters

ValueCountFrequency (%)
0 974104
31.9%
1 286663
 
9.4%
2 277271
 
9.1%
3 208661
 
6.8%
8 207345
 
6.8%
9 206142
 
6.7%
4 204800
 
6.7%
7 197967
 
6.5%
6 187548
 
6.1%
5 183362
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2933863
96.0%
Space Separator 120662
 
4.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 974104
33.2%
1 286663
 
9.8%
2 277271
 
9.5%
3 208661
 
7.1%
8 207345
 
7.1%
9 206142
 
7.0%
4 204800
 
7.0%
7 197967
 
6.7%
6 187548
 
6.4%
5 183362
 
6.2%
Space Separator
ValueCountFrequency (%)
(space) 120662
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3054525
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 974104
31.9%
1 286663
 
9.4%
2 277271
 
9.1%
3 208661
 
6.8%
8 207345
 
6.8%
9 206142
 
6.7%
4 204800
 
6.7%
7 197967
 
6.5%
6 187548
 
6.1%
5 183362
 
6.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3054525
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 974104
31.9%
1 286663
 
9.4%
2 277271
 
9.1%
3 208661
 
6.8%
8 207345
 
6.8%
9 206142
 
6.7%
4 204800
 
6.7%
7 197967
 
6.5%
6 187548
 
6.1%
5 183362
 
6.0%

quoted_tweet
Real number (ℝ)

Distinct 26664
Distinct (%) 89.3%
Missing 470842
Missing (%) 94.0%
Infinite 0
Infinite (%) 0.0%
Mean 1.127953 × 10^18
Minimum 1.2821832 × 10^17
Maximum 1.4654097 × 10^18
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 3.8 MiB

Quantile statistics

Minimum 1.2821832 × 10^17
5-th percentile 8.2930513 × 10^17
Q1 9.7814811 × 10^17
median 1.1221345 × 10^18
Q3 1.2869978 × 10^18
95-th percentile 1.4300713 × 10^18
Maximum 1.4654097 × 10^18
Range 1.3371914 × 10^18
Interquartile range (IQR) 3.0884974 × 10^17

Descriptive statistics

Standard deviation 1.9662638 × 10^17
Coefficient of variation (CV) 0.17432143
Kurtosis -0.78782612
Mean 1.127953 × 10^18
Median Absolute Deviation (MAD) 1.5599015 × 10^17
Skewness -0.17659912
Sum 3.3689701 × 10^22
Variance 3.8661934 × 10^34
Monotonicity Not monotonic
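
`quoted_tweet` is profiled as a real number, yet its values look like tweet IDs of around 10^18. Integers of that size generally cannot be represented exactly by a 64-bit float, so if the column was parsed as floating point, the low digits of each ID may have been lost. The sketch below illustrates the effect in plain Python using a 19-digit ID that appears in a video URL elsewhere in this report. Whether the underlying dataset was actually affected is not something this report confirms.

```python
# A 19-digit ID taken from a video URL in the media column, used only as an
# example of the magnitude involved (it is a media ID, not a quoted tweet ID).
original_id = 1184218546018148355

# Round-trip through a 64-bit float, as happens when an ID column is parsed as float.
as_float = float(original_id)
round_tripped = int(as_float)

print(round_tripped == original_id)      # False: the low digits are lost
print(abs(round_tripped - original_id))  # size of the rounding error

# Keeping IDs as text (or exact integers) avoids the loss.
as_text = str(original_id)
print(int(as_text) == original_id)       # True
```

When loading data with pandas, one common way to keep such IDs intact is to request a string dtype for the ID columns, for example through the `dtype` argument of `read_csv`.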
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1.464152042 × 10^18 109
< 0.1%
1.382246135 × 10^18 38
< 0.1%
1.382329397 × 10^18 24
< 0.1%
1.102405637 × 10^18 23
< 0.1%
1.217866607 × 10^18 23
< 0.1%
1.051216039 × 10^18 22
< 0.1%
9.047489028 × 10^17 19
< 0.1%
1.19248964 × 10^18 19
< 0.1%
1.433843416 × 10^18 15
< 0.1%
1.384821549 × 10^18 13
< 0.1%
Other values (26654) 29563
5.9%
(Missing) 470842
94.0%

Minimum 10 values

ValueCountFrequency (%)
1.282183151 × 10^17 1
< 0.1%
4.33934586 × 10^17 1
< 0.1%
5.121401483 × 10^17 1
< 0.1%
5.457642099 × 10^17 1
< 0.1%
5.482292164 × 10^17 1
< 0.1%
5.632395412 × 10^17 1
< 0.1%
5.65683414 × 10^17 1
< 0.1%
5.755524979 × 10^17 1
< 0.1%
5.812960695 × 10^17 1
< 0.1%
5.857120753 × 10^17 1
< 0.1%

Maximum 10 values

ValueCountFrequency (%)
1.465409744 × 10^18 1
< 0.1%
1.465384647 × 10^18 1
< 0.1%
1.465370739 × 10^18 1
< 0.1%
1.465369374 × 10^18 1
< 0.1%
1.465368831 × 10^18 1
< 0.1%
1.465338616 × 10^18 1
< 0.1%
1.465330744 × 10^18 1
< 0.1%
1.465329894 × 10^18 1
< 0.1%
1.465317604 × 10^18 1
< 0.1%
1.465300854 × 10^18 1
< 0.1%

quoted_by_count
Real number (ℝ)

SKEWED  ZEROS 

Distinct 113
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 0.20444768
Minimum 0
Maximum 1953
Zeros 460723
Zeros (%) 92.0%
Negative 0
Negative (%) 0.0%
Memory size 3.8 MiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 0
Q3 0
95-th percentile 1
Maximum 1953
Range 1953
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 5.5307667
Coefficient of variation (CV) 27.052234
Kurtosis 85322.42
Mean 0.20444768
Median Absolute Deviation (MAD) 0
Skewness 269.45985
Sum 102369
Variance 30.58938
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 460723
92.0%
1 25850
 
5.2%
2 6526
 
1.3%
3 2771
 
0.6%
4 1440
 
0.3%
5 862
 
0.2%
6 528
 
0.1%
7 393
 
0.1%
8 311
 
0.1%
9 233
 
< 0.1%
Other values (103) 1073
 
0.2%

Minimum 10 values

ValueCountFrequency (%)
0 460723
92.0%
1 25850
 
5.2%
2 6526
 
1.3%
3 2771
 
0.6%
4 1440
 
0.3%
5 862
 
0.2%
6 528
 
0.1%
7 393
 
0.1%
8 311
 
0.1%
9 233
 
< 0.1%

Maximum 10 values

ValueCountFrequency (%)
1953 1
< 0.1%
1914 1
< 0.1%
1738 1
< 0.1%
1134 1
< 0.1%
895 1
< 0.1%
847 1
< 0.1%
377 1
< 0.1%
335 1
< 0.1%
327 1
< 0.1%
326 1
< 0.1%

credibility
Categorical

IMBALANCE  MISSING 

Distinct 2
Distinct (%) < 0.1%
Missing 349025
Missing (%) 69.7%
Memory size 22.0 MiB
1.0
145982 
0.0
 
5703

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 455055
Distinct characters 3
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 1.0
2nd row 1.0
3rd row 0.0
4th row 0.0
5th row 1.0

Common Values

ValueCountFrequency (%)
1.0 145982
29.2%
0.0 5703
 
1.1%
(Missing) 349025
69.7%
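
The alerts flag `credibility` as 76.9% imbalanced. One formula that reproduces this figure from the two class counts above is one minus the normalized Shannon entropy of the class distribution, applied in the sketch below. That the profiling tool uses exactly this formula is an assumption based on the numbers agreeing, and `imbalance_score` is a name chosen here for illustration.

```python
import math

# Counts of the two observed classes (1.0 and 0.0), from the Common Values table.
counts = [145982, 5703]

def imbalance_score(class_counts):
    """1 - (Shannon entropy / maximum entropy).

    0 means perfectly balanced classes, values near 1 mean one class dominates.
    Assumes at least two classes.
    """
    total = sum(class_counts)
    probs = [c / total for c in class_counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(class_counts))
    return 1 - entropy / max_entropy

score = imbalance_score(counts)
print(f"{score:.1%}")
```

Among the non-missing rows, about 96.2% carry the value 1.0, so a classifier trained on this label without rebalancing could score well by always predicting the majority class.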

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 145982
96.2%
0.0 5703
 
3.8%

Most occurring characters

ValueCountFrequency (%)
0 157388
34.6%
. 151685
33.3%
1 145982
32.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 303370
66.7%
Other Punctuation 151685
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 157388
51.9%
1 145982
48.1%
Other Punctuation
ValueCountFrequency (%)
. 151685
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 455055
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 157388
34.6%
. 151685
33.3%
1 145982
32.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 455055
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 157388
34.6%
. 151685
33.3%
1 145982
32.1%

tweet_source
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct 3579
Distinct (%) 0.7%
Missing 0
Missing (%) 0.0%
Memory size 34.2 MiB
Twitter Web Client
105526 
Twitter for Android
65309 
Twitter Web App
63659 
Twitter for iPhone
60734 
IFTTT
24950 
Other values (3574)
180532 

Length

Max length32
Median length30
Mean length14.489501
Min length1

Characters and Unicode

Total characters7255038
Distinct characters253
Distinct categories15 ?
Distinct scripts10 ?
Distinct blocks9 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1619
Unique (%) 0.3%

Sample

1st row twitterfeed
2nd row twitterfeed
3rd row Twitter for Websites
4th row Twitter Web Client
5th row Hootsuite

Common Values

Value                Count   Frequency (%)
Twitter Web Client   105526  21.1%
Twitter for Android  65309   13.0%
Twitter Web App      63659   12.7%
Twitter for iPhone   60734   12.1%
IFTTT                24950   5.0%
TweetDeck            19819   4.0%
dlvr.it              17952   3.6%
Buffer               13977   2.8%
Facebook             13027   2.6%
WordPress.com        10769   2.2%
Other values (3569)  104988  21.0%

Length

Histogram of lengths of the category
Value                Count   Frequency (%)
twitter              310382  26.5%
web                  170668  14.6%
for                  140332  12.0%
client               105532  9.0%
app                  67365   5.8%
android              66141   5.7%
iphone               60755   5.2%
ifttt                24950   2.1%
tweetdeck            19819   1.7%
hootsuite            18196   1.6%
Other values (4088)  185625  15.9%

Most occurring characters

Value               Count    Frequency (%)
t                   860813   11.9%
e                   856723   11.8%
(space)             669503   9.2%
i                   652249   9.0%
r                   615615   8.5%
T                   414215   5.7%
o                   413677   5.7%
w                   357485   4.9%
n                   276349   3.8%
b                   199230   2.7%
Other values (243)  1939179  26.7%

Most occurring categories

Value                  Count    Frequency (%)
Lowercase Letter       5371299  74.0%
Uppercase Letter       1146737  15.8%
Space Separator        669599   9.2%
Other Punctuation      47277    0.7%
Decimal Number         13835    0.2%
Other Letter           1706     < 0.1%
Dash Punctuation       1465     < 0.1%
Connector Punctuation  1444     < 0.1%
Close Punctuation      569      < 0.1%
Open Punctuation       569      < 0.1%
Other values (5)       538      < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
185
 
10.8%
110
 
6.4%
93
 
5.5%
93
 
5.5%
93
 
5.5%
81
 
4.7%
81
 
4.7%
77
 
4.5%
稿 77
 
4.5%
77
 
4.5%
Other values (121) 739
43.3%
Lowercase Letter
Value              Count   Frequency (%)
t                  860813  16.0%
e                  856723  16.0%
i                  652249  12.1%
r                  615615  11.5%
o                  413677  7.7%
w                  357485  6.7%
n                  276349  5.1%
b                  199230  3.7%
d                  196616  3.7%
f                  179979  3.4%
Other values (42)  762563  14.2%
Uppercase Letter
Value              Count   Frequency (%)
T                  414215  36.1%
W                  185095  16.1%
A                  136894  11.9%
C                  109663  9.6%
P                  90600   7.9%
F                  43546   3.8%
I                  41707   3.6%
D                  22178   1.9%
S                  20445   1.8%
B                  19922   1.7%
Other values (23)  62472   5.4%
Other Punctuation
ValueCountFrequency (%)
. 44679
94.5%
: 1375
 
2.9%
! 549
 
1.2%
, 363
 
0.8%
/ 131
 
0.3%
@ 84
 
0.2%
' 58
 
0.1%
& 31
 
0.1%
# 3
 
< 0.1%
2
 
< 0.1%
Other values (2) 2
 
< 0.1%
Decimal Number
Value  Count  Frequency (%)
2      2148   15.5%
1      1801   13.0%
0      1747   12.6%
3      1568   11.3%
4      1439   10.4%
5      1280   9.3%
6      1093   7.9%
7      972    7.0%
8      922    6.7%
9      865    6.3%
Space Separator
ValueCountFrequency (%)
669503
> 99.9%
  95
 
< 0.1%
  1
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
® 209
97.2%
© 5
 
2.3%
🤖 1
 
0.5%
Math Symbol
ValueCountFrequency (%)
| 94
97.9%
+ 2
 
2.1%
Dash Punctuation
ValueCountFrequency (%)
- 1465
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1444
100.0%
Close Punctuation
ValueCountFrequency (%)
) 569
100.0%
Open Punctuation
ValueCountFrequency (%)
( 569
100.0%
Modifier Letter
ValueCountFrequency (%)
212
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 13
100.0%
Nonspacing Mark
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value     Count    Frequency (%)
Latin     6517727  89.8%
Common    735294   10.1%
Katakana  774      < 0.1%
Han       690      < 0.1%
Greek     297      < 0.1%
Hiragana  142      < 0.1%
Arabic    52       < 0.1%
Hangul    28       < 0.1%
Thai      22       < 0.1%
Cyrillic  12       < 0.1%

Most frequent character per script

Latin
Value              Count    Frequency (%)
t                  860813   13.2%
e                  856723   13.1%
i                  652249   10.0%
r                  615615   9.4%
T                  414215   6.4%
o                  413677   6.3%
w                  357485   5.5%
n                  276349   4.2%
b                  199230   3.1%
d                  196616   3.0%
Other values (54)  1674755  25.7%
Han
ValueCountFrequency (%)
81
 
11.7%
81
 
11.7%
77
 
11.2%
稿 77
 
11.2%
77
 
11.2%
14
 
2.0%
14
 
2.0%
14
 
2.0%
14
 
2.0%
11
 
1.6%
Other values (49) 230
33.3%
Common
ValueCountFrequency (%)
669503
91.1%
. 44679
 
6.1%
2 2148
 
0.3%
1 1801
 
0.2%
0 1747
 
0.2%
3 1568
 
0.2%
- 1465
 
0.2%
_ 1444
 
0.2%
4 1439
 
0.2%
: 1375
 
0.2%
Other values (26) 8125
 
1.1%
Katakana
ValueCountFrequency (%)
185
23.9%
110
14.2%
93
12.0%
93
12.0%
93
12.0%
40
 
5.2%
23
 
3.0%
21
 
2.7%
16
 
2.1%
16
 
2.1%
Other values (10) 84
10.9%
Arabic
ValueCountFrequency (%)
ا 10
19.2%
ر 7
13.5%
ن 6
11.5%
ی 6
11.5%
ب 3
 
5.8%
خ 3
 
5.8%
ژ 3
 
5.8%
آ 3
 
5.8%
س 3
 
5.8%
ت 2
 
3.8%
Other values (6) 6
11.5%
Hangul
ValueCountFrequency (%)
6
21.4%
3
10.7%
3
10.7%
2
 
7.1%
2
 
7.1%
2
 
7.1%
1
 
3.6%
1
 
3.6%
1
 
3.6%
1
 
3.6%
Other values (6) 6
21.4%
Greek
ValueCountFrequency (%)
Ο 282
94.9%
ρ 2
 
0.7%
ί 2
 
0.7%
α 2
 
0.7%
Τ 1
 
0.3%
π 1
 
0.3%
ο 1
 
0.3%
λ 1
 
0.3%
η 1
 
0.3%
Α 1
 
0.3%
Other values (3) 3
 
1.0%
Hiragana
ValueCountFrequency (%)
23
16.2%
17
12.0%
15
10.6%
15
10.6%
15
10.6%
15
10.6%
13
9.2%
13
9.2%
8
 
5.6%
4
 
2.8%
Thai
ValueCountFrequency (%)
4
18.2%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
Cyrillic
ValueCountFrequency (%)
О 4
33.3%
о 2
16.7%
в 1
 
8.3%
т 1
 
8.3%
с 1
 
8.3%
Н 1
 
8.3%
и 1
 
8.3%
С 1
 
8.3%

Most occurring blocks

Value     Count    Frequency (%)
ASCII     7252280  > 99.9%
Katakana  988      < 0.1%
None      825      < 0.1%
CJK       690      < 0.1%
Hiragana  142      < 0.1%
Arabic    52       < 0.1%
Hangul    27       < 0.1%
Thai      22       < 0.1%
Cyrillic  12       < 0.1%

Most frequent character per block

ASCII
Value              Count    Frequency (%)
t                  860813   11.9%
e                  856723   11.8%
(space)            669503   9.2%
i                  652249   9.0%
r                  615615   8.5%
T                  414215   5.7%
o                  413677   5.7%
w                  357485   4.9%
n                  276349   3.8%
b                  199230   2.7%
Other values (70)  1936421  26.7%
None
ValueCountFrequency (%)
Ο 282
34.2%
® 209
25.3%
ó 126
15.3%
  95
 
11.5%
ŋ 50
 
6.1%
´ 13
 
1.6%
í 9
 
1.1%
ĭ 6
 
0.7%
© 5
 
0.6%
é 3
 
0.4%
Other values (22) 27
 
3.3%
Katakana
ValueCountFrequency (%)
212
21.5%
185
18.7%
110
11.1%
93
9.4%
93
9.4%
93
9.4%
40
 
4.0%
23
 
2.3%
21
 
2.1%
16
 
1.6%
Other values (12) 102
10.3%
CJK
ValueCountFrequency (%)
81
 
11.7%
81
 
11.7%
77
 
11.2%
稿 77
 
11.2%
77
 
11.2%
14
 
2.0%
14
 
2.0%
14
 
2.0%
14
 
2.0%
11
 
1.6%
Other values (49) 230
33.3%
Hiragana
ValueCountFrequency (%)
23
16.2%
17
12.0%
15
10.6%
15
10.6%
15
10.6%
15
10.6%
13
9.2%
13
9.2%
8
 
5.6%
4
 
2.8%
Arabic
ValueCountFrequency (%)
ا 10
19.2%
ر 7
13.5%
ن 6
11.5%
ی 6
11.5%
ب 3
 
5.8%
خ 3
 
5.8%
ژ 3
 
5.8%
آ 3
 
5.8%
س 3
 
5.8%
ت 2
 
3.8%
Other values (6) 6
11.5%
Hangul
ValueCountFrequency (%)
6
22.2%
3
11.1%
3
11.1%
2
 
7.4%
2
 
7.4%
2
 
7.4%
1
 
3.7%
1
 
3.7%
1
 
3.7%
1
 
3.7%
Other values (5) 5
18.5%
Thai
ValueCountFrequency (%)
4
18.2%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
Cyrillic
ValueCountFrequency (%)
О 4
33.3%
о 2
16.7%
в 1
 
8.3%
т 1
 
8.3%
с 1
 
8.3%
Н 1
 
8.3%
и 1
 
8.3%
С 1
 
8.3%

Interactions

[Figures: 64 pairwise interaction plots (Matplotlib SVG images), not reproduced in this text version.]

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
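
To make the idea of nullity correlation concrete, the sketch below computes the Pearson correlation between the missing/present indicators of two columns using only the standard library. The function name `nullity_correlation` and the toy columns are assumptions for illustration; with pandas the same quantity is commonly obtained via `df.isnull().corr()`, and the report's heatmap may apply further filtering.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns.
    A cell counts as missing when it is None."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is always or never missing has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy data: links and credibility are missing in the same rows,
# while hashtag is missing in mostly different rows.
links = ["http://a", None, None, "http://b", None]
credibility = [1.0, None, None, 0.0, None]
hashtag = [None, "x", None, "y", "z"]

same_rows = nullity_correlation(links, credibility)
other_rows = nullity_correlation(links, hashtag)
print(round(same_rows, 3), round(other_rows, 3))
```

A value near 1 means two columns tend to be missing together, a value near -1 means one is present when the other is missing, and a value near 0 means their missingness is unrelated.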

Sample

(index) | user_id | timestamp | tweet_id | sentiment_polarity | text_lang_ft | text_normalized | links | hashtag | hashtag_lang | hashtag_en | cashtag | media | image_url | video_url | GIF_url | likes | retweets | replies | reply_to_user | mentioned_users | quoted_tweet | quoted_by_count | credibility | tweet_source
0 | 116680494 | 2013-09-03 02:22:09+00:00 | 374718928682885121 | 0.2732 | en 88 | ['nation', 'agree', 'build', 'new', 'silk', 'road', 'china', 'enhance', 'partnership', 'neighbor', 'west', 'aim'] | http://bit.ly/17lyTPM | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0 | 0 | 0 | NaN | NaN | NaN | 0 | 1.0 | twitterfeed
1 | 172576367 | 2013-09-03 02:22:11+00:00 | 374718937889402880 | 0.2732 | en 84 | ['nation', 'agree', 'build', 'new', 'silk', 'road', 'china', 'enhance', 'partnership', 'neighbor', 'west', 'aim'] | http://bit.ly/17lySv6 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0 | 0 | 0 | NaN | NaN | NaN | 0 | 1.0 | twitterfeed
2 | 154226261 | 2013-09-03 10:11:50+00:00 | 374837127873175553 | 0.0000 | en 47 | ['high', 'speed', 'rail', 'china', 'new', 'silk', 'road', 'perspective'] | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1 | 0 | 0 | NaN | 83521919 | NaN | 0 | NaN | Twitter for Websites
3 | 61733677 | 2013-09-03 11:33:26+00:00 | 374857665735704576 | 0.2732 | en 65 | ['nation', 'agree', 'build', 'new', 'silk', 'road'] | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0 | 0 | 0 | NaN | NaN | NaN | 0 | NaN | Twitter Web Client
4 | 87775422 | 2013-09-03 20:10:51+00:00 | 374987876737765376 | 0.0000 | en 56 | ['china', 'kazakhstan', 'tajikistan', 'russia', 'mongolia', 'build', 'new', 'silk', 'road'] | http://usa.chinadaily.com.cn/epaper/2013-09/03/content_16940556.htm | China | en 50 | China | NaN | NaN | NaN | NaN | NaN | 2 | 6 | 0 | NaN | NaN | NaN | 0 | 0.0 | Hootsuite
5 | 241024381 | 2013-09-03 20:30:54+00:00 | 374992924548685824 | 0.5859 | en 49 | ['xijinpe', 'tour', 'central', 'asia', 'aim', 'boost', 'energy', 'cooperation', 'pare', 'non', 'solo', 'newsilkroad'] | NaN | Asia energy NewSilkRoad | en 48 | Asia energy NewSilkRoad | NaN | NaN | NaN | NaN | NaN | 0 | 0 | 0 | NaN | NaN | NaN | 0 | NaN | Twitter Web Client
6 | 848121642 | 2013-09-03 22:26:32+00:00 | 375022023241908224 | 0.4939 | en 40 | ['nation', 'agree', 'build', 'new', 'silk', 'roadbusiness', 'china', 'asia', 'energy'] | NaN | china asia energy | en 41 | china asia energy | NaN | NaN | NaN | NaN | NaN | 0 | 0 | 0 | NaN | 1050000000000000000 | NaN | 0 | NaN | toptoptopics
7 | 195642840 | 2013-09-03 23:30:04+00:00 | 375038010070298627 | 0.0000 | en 60 | ['china', 'kazakhstan', 'tajikistan', 'russia', 'mongolia', 'build', 'new', 'silk', 'road'] | http://usa.chinadaily.com.cn/epaper/2013-09/03/content_16940556.htm | China | en 50 | China | NaN | NaN | NaN | NaN | NaN | 0 | 0 | 0 | NaN | NaN | NaN | 0 | 0.0 | Twitter for Android
8 | 29617875 | 2013-09-04 00:04:47+00:00 | 375046748600684544 | 0.3818 | en 77 | ['mongolia', 'china', 'russia', 'nation', 'build', 'new', 'silk', 'road', 'accelerate', 'economic', 'recovery', 'promote', 'trade'] | NaN | Mongolia | en 18 | Mongolia | NaN | NaN | NaN | NaN | NaN | 0 | 0 | 0 | NaN | NaN | NaN | 0 | NaN | Twitter Web Client
9 | 256478435 | 2013-09-04 00:18:01+00:00 | 375050078136070144 | 0.2732 | en 29 | ['nation', 'agree', 'build', 'new', 'silk', 'roadbusiness'] | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0 | 0 | 0 | NaN | NaN | NaN | 0 | NaN | Hootsuite
(index) | user_id | timestamp | tweet_id | sentiment_polarity | text_lang_ft | text_normalized | links | hashtag | hashtag_lang | hashtag_en | cashtag | media | image_url | video_url | GIF_url | likes | retweets | replies | reply_to_user | mentioned_users | quoted_tweet | quoted_by_count | credibility | tweet_source
500700861974182019-10-09 11:24:36+00:001181893218935459843NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN0011068680000000000000.08628872NaN0NaNTwitter Web App
50070130971654702020-03-30 00:58:59+00:001244428878895960065NaNNaNNaNNaNNaNNaNNaNNaN[Photo(previewUrl='https://pbs.twimg.com/media/EUUZUyvU8AAHVBX?format=png&name=small', fullUrl='https://pbs.twimg.com/media/EUUZUyvU8AAHVBX?format=png&name=large'), Photo(previewUrl='https://pbs.twimg.com/media/EUUZXIiU0AAhYNB?format=png&name=small', fullUrl='https://pbs.twimg.com/media/EUUZXIiU0AAhYNB?format=png&name=large'), Photo(previewUrl='https://pbs.twimg.com/media/EUUZap9UUAEvxUf?format=png&name=small', fullUrl='https://pbs.twimg.com/media/EUUZap9UUAEvxUf?format=png&name=large'), Photo(previewUrl='https://pbs.twimg.com/media/EUUZdJ8UcAArYWU?format=png&name=small', fullUrl='https://pbs.twimg.com/media/EUUZdJ8UcAArYWU?format=png&name=large')]NaNNaNNaN100NaNNaNNaN0NaNTwitter Web App
5007028869466542020-07-11 00:20:55+00:001281745248562151426NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN000NaNNaNNaN0NaNTwitter for iPad
5007039540000000000000002020-07-24 12:14:34+00:001286635884981493761NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN00144123487.044123487 154016912NaN0NaNTwitter Web App
50070410300000000000000002020-10-30 13:55:46+00:001322175364752240642NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN000954000000000000000.0174872895 1180000000000000000 17629860 174872895NaN0NaNTwitter Web App
50070522601285162020-12-14 09:14:43+00:001338412091988992001NaNNaNNaNNaN['OOTT', 'OilPrices', 'OPEC', 'energy', 'COVID19', 'Bitcoin', 'GIEnergyOutlook21']NaNNaNNaN[Video(thumbnailUrl='https://pbs.twimg.com/ext_tw_video_thumb/1338398795126763523/pu/img/MWptwOyFp2EI5RWv.jpg', variants=[VideoVariant(contentType='video/mp4', url='https://video.twimg.com/ext_tw_video/1338398795126763523/pu/vid/658x360/BPyQi0cWMVjEygMv.mp4?tag=10', bitrate=832000), VideoVariant(contentType='application/x-mpegURL', url='https://video.twimg.com/ext_tw_video/1338398795126763523/pu/pl/TuUIlS6iBjnEDO1n.m3u8?tag=10', bitrate=None), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/ext_tw_video/1338398795126763523/pu/vid/1318x720/RiDgDT9TmUCsYyYm.mp4?tag=10', bitrate=2176000), VideoVariant(contentType='video/mp4', url='https://video.twimg.com/ext_tw_video/1338398795126763523/pu/vid/494x270/N4JrGtnxK-YUYPj-.mp4?tag=10', bitrate=256000)], duration=28.228, views=77)]NaNNaNNaN8511029750000000000000.0NaNNaN0NaNTwitter Web App
50070611700000000000000002021-01-30 16:35:10+00:001355555164686413826NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN613NaN4375297954 2487093234 710000000000000000NaN0NaNTwitter Web App
5007071829103732021-05-04 00:16:16+00:001389373273935433731NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN34116NaN1190000000000000000 3097165470 102282721.389222e+180NaNTwitter for Android
5007085176453532021-05-17 11:00:35+00:001394246465724289024NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN210NaNNaNNaN0NaNTwitter for iPhone
500709485346422021-10-04 17:49:32+00:001445083681991843857NaNNaNNaNNaN['Turkey']NaNNaNNaNNaNNaNNaNNaN3481NaNNaNNaN0NaNTwitter Web App